Preface


This book is based on the notes of both authors for a course called “Higher 
Algebra,” a graduate level course. Its purpose was to offer the basic abstract algebra 
that any student of mathematics seeking an advanced degree might require. 
Students may have been previously exposed to some of the basic algebraic objects 
(groups, rings, vector spaces, etc.) in an introductory abstract algebra course such as 
that offered in the classic book of Herstein. But that exposure should not be a hard 
requirement as this book proceeds from first principles. Aside from the far greater 
theoretical depth, perhaps the main difference between an introductory algebra
course and a course in “higher algebra” (as exemplified by classics such as
Jacobson’s Basic Algebra [1, 2] and Van der Waerden’s Modern Algebra [3]) is an
emphasis on the student understanding how to construct a mathematical proof, and
that is where the exercises come in.

The authors rotated teaching this one-year course called “Higher Algebra” at 
Kansas State University for 15 years—each of us generating his own set of notes 
for the course. This book is a blend of these notes. 

Listed below are some special features of these notes. 


1. (Combinatorial Background) Often the underlying combinatorial contexts— 
partially ordered sets etc.—seem almost invisible in a course on modern algebra. 
In fact they are often developed far from home in the middle of some specific 
algebraic context. Partially ordered sets are the natural context in which to 
discuss the following: 


(a) Zorn’s Lemma and the ascending and descending chain conditions, 

(b) Galois connections, 

(c) The modular law, 

(d) The Jordan-Hölder Theorem,

(e) Dependence Theories (needed for defining various notions of 
“dimension’’). 


The Jordan-Hölder Theorem asserts that in a lower semimodular semilattice, any
semimodular function from the set of covers (unrefinable chains of length one)
to a commutative monoid extends to an interval measure on all algebraic
intervals (those intervals containing a finite unrefinable chain from the bottom to
the top). The extension exists because the multiset of values of the function on
the covers in any two unrefinable chains connecting a and b must be the same.
The proof is quite easy, and the applications are everywhere. For example, when
G is a finite group and P is the poset of subnormal subgroups, one notes that P is
a semimodular lower semilattice and the reading of the simple group A/B of a
cover B < A is a semimodular function on covers by a fundamental theorem of
homomorphisms of groups. By the theorem being described, this function
extends to an interval measure with values in the additive monoid of multisets
on the isomorphism classes of simple groups. The conclusion of the combinatorial
Jordan-Hölder version in this context becomes the classical Jordan-Hölder
Theorem for finite groups. One needs no “Butterfly Lemma” or anything else.

2. (Free Groups) Often a free group on generators X is presented in an awkward
way, by defining a “multiplication” on “reduced words” r(w), where w is a word in
the free monoid $M(X \cup X^{-1})$. “Reduced” means all factors of the form $xx^{-1}$ have
been removed. Here are the complications: First the reductions, which can often be
performed in many ways, must lead to a common reduced word. Then one must
show $r(w_1 \circ w_2) = r(r(w_1) \circ r(w_2))$ to get “multiplication” defined on reduced
words. Then one needs to verify the associative law and the other group axioms.
In this book the free group is defined to be the automorphism group of a certain
labelled graph, and the universal mapping properties of the free group are easily
derived from the graph. Since full sets of automorphisms of an object always
form a group, one will not be wasting time showing that an awkwardly-defined
multiplication obeys the axioms of a group.


3. (Universal Mapping Properties) These are always instances of the existence of
an initial or terminal object in an appropriate category.


4. (Avoiding Determinants of Matrices) Of course one needs matrices to describe
linear transformations of vector spaces, or to record data about bilinear forms
(the Grammian). It is important to know when the rows or columns of a matrix
are linearly dependent. One can calculate what is normally called the determinant
by finding the invariant factors. For an $n \times n$ matrix, that process involves
roughly $n^3$ steps, while the usual procedure for evaluating the determinant using
Lagrange’s rule involves exponentially many steps.

One of the standard proofs that the trace mapping $\mathrm{tr} : K \to F$ of a finite separable
field extension $F \subseteq K$ is nonzero proceeds as follows: First, one forms the
normal closure L of the field K. One then invokes the theorem that $L = F(\theta)$, a
simple extension, with the algebraic conjugates of $\theta$ as an F-basis of L. And then
one reaches the conclusion by observing that a Vandermonde determinant is
non-zero. Perhaps it is an aesthetic quibble, but one does not like to see a nice
“soft” algebraic proof about “soft” algebraic objects reduced to a matrix calculation.
In Sect. 11.7 the proof that the trace is non-trivial is accomplished
using only the Dedekind Independence Lemma and an elementary fact about
bilinear forms.
In general, in this book, the determinant of a transformation T acting on an
n-dimensional vector space V is defined to be the scalar multiplication it induces
on the n-th exterior product $\Lambda^n(V)$. Of course there are historical reasons for
making a few exceptions to any decree to ban the usual formulaic definition of
determinants altogether. Our historical discussion of the discriminant on page
395 is such an exception.


In addition, we have shaped the text with several pedagogical objectives in mind. 


1. (Catch-up opportunities) Not infrequently, the teacher of a graduate course is 
expected to accommodate incoming transfer students whose mathematical 
preparation is not quite the same as that of current students of the program, or is 
even unknown. At the same time, this accommodation should not sacrifice 
course content for the other students. For this reason we have written each
chapter at a gradient—with simplest explanations and examples first, before 
continuing at the level the curriculum requires. This way, a student may “catch 
up” by studying the introductory material more intensely, while a more brief 
review of it is presented in class. Students already familiar with the introductory 
material have merely to turn the page. 

2. (Curiosity-driven Appendices) The view of both authors has always been that a 
course in Algebra is not an exercise in cramming information, but is instead a 
way of inspiring mathematical curiosity. Real learning is basically 
curiosity-driven self-learning. Discussing what is already known is simply
there to guide the student to the real questions. For that reason we have inserted
a number of appendices which are largely centered around insights connected
with proofs in the text. Similarly, in the exercises, we have occasionally wandered
into open problems or offered avenues for exploration. Mathematics
education is not a catechism. 

3. (Planned Redundancy) Besides its role as a course guide, a textbook often lives
another life as a source book. There is always the need of a student or colleague 
in a nearby mathematical field to check on some algebraic fact—say, to make 
sure of the hypotheses that accompany that fact. He or she does not need to read 
the whole book. But occasionally one wanders into the following scenario: one 
looks up topic A in the index, and finds, at the indicated page, that A is defined 
by further words B and C whose definition can be deciphered by a further visit 
to the index, which obligingly invites one to further pages at which the frustration
may be enjoyed once again. It becomes a tree search. In order to intercept
this process, we have tried to do the following: when an earlier-defined key
concept re-inserts itself in a later discussion, we simply recall the definition for
the reader at that point, while offering a page number where the concept was
originally defined.¹ Nevertheless we are introducing a redundancy. But in the
view of the authors’ experience in Kansas, redundancy is a valuable tool in
teaching. Inspiration is useless if the student cannot first understand the words,
and no teacher should apologize for redundancy. Judiciously applied, it does not
waste class time; it actually saves it.

¹If we carried out this process for the most common concepts, pages would be filled with
re-definitions of rings, natural numbers, and what the containment relation is. Of course one has to
limit these reminders of definitions to new key terms.


Of course there are many topics—direct offshoots of the material of this 
course—that cannot be included here. One cannot do justice to such topics in a brief 
survey like this. Thus one will not find in this book material about 
(i) Representation Theory and Character Theory of Groups, (ii) Commutative Rings 
and the world of Ext and Tor, (iii) Group Cohomology or other Homological 
Algebra, (iv) Algebraic Geometry, (v) Really Deep Algebraic Number Theory and 
(vi) many other topics. The student is better off receiving a full exposition of these 
courses elsewhere rather than being deceived by the belief that the chapters of this 
book provide such an expertise. Of course, we try to indicate some of these points 
of departure as we meet them in the text, at times suggesting exterior references. 

A few words are inserted here about how the book can be used. 

As mentioned above, the book is a blend of the notes of both authors who 
alternately taught the course for many years. Of course there is much more in this 
book than can reasonably be covered in a two-semester course. In practice a course 
includes enough material from each chapter to reach the principal theorems. That is,
portions of chapters can be left out. Of course the authors did not always present the 
course in exactly the same way, but the differences were mainly in the way focus 
and depth were distributed over the various topics. We did not “teach” the 
appendices to the chapters. They were there for the students to explore on their 
own. 

The syllabus presented here would be fairly typical. The numbers in parentheses
represent the number of class-hours the lectures usually consume. A two-semester 
course entails 72 class-hours. Beyond the lectures we normally allowed ourselves 
10-12 h for examinations and review of exercises. 


1. Chapter 1: (1 or 2) [This goes quickly since it involves only two easy proofs.] 
2. Chapter 2: (6, at most) [This also goes quickly since, except for three easy 
proofs, it is descriptive. The breakdown would be: (a) 2.2.1—2.2.9 (skip 2.2.10), 
2.2.10—2.2.15 (3 h), (b) 2.3 and 2.5 (2 h) and (c) 2.6 (1 h).] 
3. Chapter 3: (3)
4. Chapter 4: (3) [Sometimes omitting 4.2.3.]
5. Chapter 5: (3 or 4) [Sometimes omitting 5.5.]
6. Chapter 6: (3) [Omitting the Brauer-Ree Theorem [6.4] but reserving 15
minutes for Sect. 6.6.]
7. Chapter 7: (3) [Mostly examples and few proofs. Section 7.3.6 is often omitted.]
8. Chapter 8: (7 or 8) [Usually (a) 8.1 (2 or 3 h), and (b) 8.2-8.4 (4 h). We
sometimes omitted Sect. 8.3 if behind schedule.]
9. Chapter 9: (4) [One of us taught only 9.1-9.8 (sometimes omitting the local
characterization of UFDs in 9.6.3) while the other would teach all of 9.9-9.12
(Dedekind’s Theorem and the ideal class group).]
10. Chapter 10: (6) [It takes 3 days for 10.1-10.5. The student is asked to read 10.6,
and 3 days remain for Sect. 10.7 and autopsies on some of the exercises.]

11. Chapter 11: (11) [The content is rationed as follows: (a) 11.1-11.4 (2 h), 
sometimes omitting 11.4.2 if one is short a day, (b) 11.5-11.6 (3 h) (c) 11.7 
[One need only mention this.] (d) 11.8-11.9 (3 h) (e) [One of us would often 
omit 11.10 (algebraic field extensions are often simple extensions). Many insist 
this be part of the Algebra Catechism. Although the result is vaguely interesting,
it is not needed for a single proof in this book.] (f) 11.11 (1 h) (g) [Then
a day or two would be spent going through sampled exercises.]] 

12. Chapter 12: (5 or 6) [Content divided as (a) 12.1-12.3 (2 h) and (b) 12.4-12.5 
(2 h) with an extra hour wherever needed.]

13. Chapter 13: (9 or 10) [Approximate time allotment: (a) 13.1-13.2 (1 h) (only 
elementary proofs here), (b) 13.3.1-13.3.2 (1 h), (c) 13.3.3-13.3.4 (adjunct 
functors) (1 h), (d) 13.4-13.5 (1 h), (e) 13.6-13.8 (1 or 2 h), (f) 13.9 (1 h), 13.10
(2 h) and 13.8 (3 h).]


The list above is only offered as an example. The book provides ample “wiggle 
room” for composing alternative paths through this course, perhaps even 
re-arranging the order of topics. The one invariant is that Chap. 2 feeds all subsequent
chapters.

Beyond this, certain groups of chapters may serve as one semester courses on 
their own. Here are some suggestions: 

GROUP THEORY: Chaps. 3-6 (invoking only the Jordan-Hölder Theorem from
Chap. 2).

THEORY OF FIELDS: After an elementary preparation about UFD’s (their maximal
ideals, and homomorphisms of polynomial rings in Chap. 6), and Groups (their
actions, homomorphisms and facts about subgroup indices from Sects. 3.2, 3.3 and
4.2) one could easily compose a semester course on Fields from Chap. 11.

ARITHMETIC: UFD’s, including PID’s with applications to Linear Algebra using
Chaps. 7-10.

BASIC RING THEORY: leading to Wedderburn’s Theorem. Chapters 7, 8 and 12.

RINGS AND MODULES, TENSOR PRODUCTS AND MULTILINEAR ALGEBRA: Chaps. 7, 8
and 13.
Chapter 1
Basics 


Abstract The basic notational conventions used in this book are described. Composition
of mappings is defined the standard left-handed way: $f \circ g$ means mapping g was
applied first. But things are a little more complicated than that since we must also deal
with both left and right operators, binary operations and monoids. For example, right
operators are sometimes indicated exponentially, that is, by right superscripts (as
in group conjugation), or by right multiplication (as in right R-modules). Despite
this, the “$\circ$”-notation for composition will always have its left-handed interpretation.
Of course a basic discussion of sets, maps, and equivalence relations should
be expected in a beginning chapter. Finally the basic arithmetic of the natural and
cardinal numbers is set forth so that it can be used throughout the book without
further development. (Proofs of the Schröder-Bernstein Theorem and the fact that
$\aleph_0 \cdot \aleph_0 = \aleph_0$ appear in this discussion.) Clearly this chapter is only about everyone
being on the same page at the start.


1.1 Presumed Results and Conventions 


1.1.1 Presumed Jargon 


Most abstract algebraic structures in these notes are treated from first principles. Even 
so, the reader is assumed to have already acquired some familiarity with groups, 
cosets, group homomorphisms, ring homomorphisms and vector spaces from an 
undergraduate abstract algebra course or linear algebra course. We rely on these 
topics mostly as a source of familiar examples which can aid the intuition as well as 
points of reference that will indicate the direction various generalizations are taking. 


The Abstraction of Isomorphism Classes 


What do we mean by saying that object A is isomorphic to object B? In general, in 
algebra, we want objects A and B to be isomorphic if and only if one can obtain 
a complete description of object B simply by changing the names of the operating 
parts of object A and the names of the relations among them that must hold, and
vice versa. From this point of view we are merely dealing with the same situation, but 
under a new management which has renamed everything. This is the “alias” point of 
view; the two structures are really the same thing with some name changes imposed. 

The other way—the “alibi” point of view—is to form a one-to-one correspondence 
(a bijection) of the relevant parts of object A with object B such that a relation holds 
among parts in the domain (A) if and only if the corresponding relation holds among 
their images (parts of set B).¹

There is no logical distinction between the two approaches, only a psychological
one.

Unfortunately “renaming” is a subjective human conceptualization that is awkward
to define precisely. That is why, at the beginning, there is a preference for
describing an isomorphism in terms of bijections rather than “re-namings”, even 
though many of us secretly think of it as little more than a re-baptism. 

It is a standing habit in abstract mathematics for one to assert that mathematical 
objects are “the same” or even “equal” when one only means that the two objects 
are isomorphic. It is an abuse of language when we say that “two manifolds are the
same”, “two groups are the same”, or that “A and B are really the same ring”. We
shall meet this over and over again; for this is at the heart of the “abstractness” of
Abstract Algebra.²


1.1.2 Basic Arithmetic 


The integers are normally employed in analyzing any finite structure. Thus for reference
purposes, it will be useful to establish a few basic arithmetic properties of
the integers. The integers enjoy the two associative and commutative operations of 
addition and multiplication, connected by the distributive law, that every student is 
familiar with. 

There is a natural (transitive) order relation among the integers: thus 


$$\cdots < -4 < -3 < -2 < -1 < 0 < 1 < 2 < 3 < \cdots.$$


If a < b, in this ordering, we say “integer a is less than integer b”. (This can also 
be rendered by saying “b is greater than a”.) In the set $\mathbb{Z}$ of integers, those integers
greater than or equal to zero form a set

$$\mathbb{N} = \{0, 1, 2, \ldots\}$$

called the natural numbers. Obviously, for every integer a that is not zero, exactly
one of the integers a or $-a$ is positive, and this one is denoted $|a|$, and is called the
absolute value of a. We also define 0 to be the absolute value of itself, and write
$0 = |0|$.

¹We are deliberately vague in talking about parts rather than “elements” for the sake of generality.

²There is a common misunderstanding of this word “abstract” that mathematicians seem condemned
to suffer. To many, “abstract” seems to mean “having no relation to the world, no applications”.
Unfortunately, this is the overwhelming view of politicians, pundits of Education, and even many
University Administrators throughout the United States. One hears words like “Ivory Tower”,
“Intellectuals on welfare”, etc. On the contrary, these people have it just backwards. A concept is “abstract”
precisely because it has more than one application, not that it hasn’t any application. It is very
important to realize that two things introduced in distant contexts are in fact the same structure and
subject to the same abstract theorems.

Of course this subset N inherits a total ordering from Z, but it also possesses a 
very important property not shared by Z: 


(The well-ordering property) Every non-empty subset of N possesses a least 
member. 


This property is used in the Lemma below. 
Lemma 1.1.1 (The Division Algorithm) Let a, b be integers with $a \neq 0$. Then there
exist unique integers q (quotient) and r (remainder) such that

$$b = qa + r, \quad \text{where } 0 \le r < |a|.$$


Proof Define the set $R := \{b - qa \mid q \in \mathbb{Z},\ b - qa \ge 0\}$; clearly $R \neq \emptyset$. Since
the set of non-negative integers $\mathbb{N}$ is well ordered (See p. 34, Example 1), the set R
must have a least element, call it r. Therefore, it follows already that $b = qa + r$
for suitable integers q, r and where $r \ge 0$. If it were the case that $r \ge |a|$, then
setting $r' := r - |a|$, one has $r' < r$ and $r' \ge 0$, and yet
$b = qa + r = qa + (r' + |a|) = (q \pm 1)a + r'$ (depending on whether a is positive or negative).
Therefore, $r' = b - (q \pm 1)a \in R$, contrary to the minimality of r. Therefore, we
conclude the existence of integers q, r with

$$b = qa + r, \quad \text{where } 0 \le r < |a|,$$

as required.
The uniqueness of q, r turns out to be unimportant for our purposes; therefore we
shall leave that verification to the reader.
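The statement of Lemma 1.1.1 can be checked numerically. What follows is only a minimal sketch in Python, not part of the original notes; the function name `divide` is a hypothetical choice made for illustration. It assumes the behaviour of Python's built-in `divmod`, which returns a non-negative remainder whenever the divisor is positive, and it flips the sign of the quotient when a is negative, which plays the role of the $(q \pm 1)a$ bookkeeping in the proof.

```python
def divide(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < |a|, for a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero")
    # Dividing by |a| > 0 makes divmod return a remainder in [0, |a|).
    q, r = divmod(b, abs(a))
    # If a is negative then q*|a| == (-q)*a, so only the quotient changes sign.
    if a < 0:
        q = -q
    return q, r


if __name__ == "__main__":
    for b, a in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
        q, r = divide(b, a)
        print(f"{b} = ({q})*({a}) + {r}")
```

For instance, `divide(-7, 3)` yields q = -3 and r = 2, since -7 = (-3)(3) + 2.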


If n and m are integers, and if $n \neq 0$, we say that n divides m, and write $n \mid m$, if
$m = qn$ for some integer (possibly 0) q. If a, b are integers, not both 0, we call d a
greatest common divisor of a and b if


(i) $d > 0$,
(ii) $d \mid a$ and $d \mid b$,
(iii) for any integer c satisfying the properties of d in (i), (ii), above, we must have
$c \mid d$.


Lemma 1.1.2 Let a, b be integers, not both 0. Then a greatest common divisor of a 
and b exists and is unique. Moreover, if d is the greatest common divisor of a and b, 
then there exist integers s and t such that 


$$d = sa + tb \quad \text{(The Euclidean Trick).}$$
Proof Here, we form the set $D := \{xa + yb \mid x, y \in \mathbb{Z},\ xa + yb > 0\}$. Again, it is
routine to verify that $D \neq \emptyset$. We let d be the smallest element of D, and let s and
t be integers with $d = sa + tb$. We shall show that $d \mid a$ and that $d \mid b$. Apply the
division algorithm to write

$$a = qd + r, \quad 0 \le r < d.$$

If $r > 0$, then we have

$$r = a - qd = a - q(sa + tb) = (1 - qs)a - qtb \in D,$$

contrary to the minimality of $d \in D$. Therefore, it must happen that $r = 0$, i.e., that
$d \mid a$. In an entirely similar fashion, one proves that $d \mid b$. Finally, if $c \mid a$ and $c \mid b$, then
certainly $c \mid (sa + tb)$, which says that $c \mid d$.


As a result of Lemma 1.1.2, when the integers a and b are not both 0, we may 
speak unambiguously of their greatest common divisor d and write d = GCD(a, b). 
When GCD (a, b) = 1, we say that a and b are relatively prime. 
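The proof of Lemma 1.1.2 is an existence argument: it takes the least element of the set D and does not say how to find s and t. One standard way to compute them is the extended Euclidean algorithm, which is a different technique from the minimality argument above. The sketch below is offered only as an illustration under that assumption; the function name `euclidean_trick` is hypothetical and does not come from the text.

```python
def euclidean_trick(a, b):
    """Return (d, s, t) with d = GCD(a, b) > 0 and d == s*a + t*b."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    # Loop invariant: r0 == s0*a + t0*b and r1 == s1*a + t1*b.
    r0, s0, t0 = a, 1, 0
    r1, s1, t1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        # Replace (r0, r1) by (r1, r0 - q*r1) and update the coefficients
        # in the same way, so the invariant is preserved.
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    # Condition (i) of the definition asks for d > 0.
    if r0 < 0:
        r0, s0, t0 = -r0, -s0, -t0
    return r0, s0, t0


if __name__ == "__main__":
    d, s, t = euclidean_trick(12, 18)
    print(d, s, t, s * 12 + t * 18)
```

Any output (d, s, t) can be checked against the definition directly: d is positive, d divides both a and b, and the identity d = sa + tb forces every common divisor of a and b to divide d.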

One final simple, but useful, number-theoretic result: 


Corollary 1.1.3 Let a and b be relatively prime integers with $a \neq 0$. If for some
integer c, $a \mid bc$, then $a \mid c$.


Proof By the Euclidean Trick, there exist integers s and t with $sa + tb = 1$. Multiplying
both sides by c, we get $sac + tbc = c$. Since a divides bc, we infer that a
divides $sac + tbc$, which is to say that $a \mid c$.


1.1.3 Sets and Maps 


1. Sets: Intuitively, a set A is a collection of objects. If x is one of the objects of
the collection we write $x \in A$ and say that “x is a member of set A”.
The reader should have a comfortable rapport with the following set-theoretic
concepts: the notions of membership, containment and the operations of intersection
and union over arbitrary collections of subsets of a set. In order to make
our notation clear we define these concepts:


(a) If A and B are sets, the notation $A \subseteq B$ represents the assertion that every
member of set A is necessarily a member of set B. Two sets A and B are
considered to be the same set if and only if every member of A is a member
of B and every member of B is a member of A, that is, $A \subseteq B$ and $B \subseteq A$.
In this case we write $A = B$.³


³Of course the sets A and B might have entirely different descriptions, and yet possess the same
collection of members. 
(b) There is a set called the empty set which has no members. It is denoted
by the universal symbol $\emptyset$. The empty set is contained in any other set
A. To prove this assertion one must show that $x \in \emptyset$ implies $x \in A$. By
definition, the hypothesis (the part preceding the word “implies”) is false.
A false statement implies any statement, in particular our conclusion that
$x \in A$. (This recitation reveals the close relation between sets and logic.)
In particular, since any empty set is contained in any other, they are all
considered to be “equal” as sets, thus justifying the use of one single symbol
“$\emptyset$”.

(c) Similarly, if A and B are sets, the symbol $A - B$ denotes the set $\{x \in A \mid x \notin B\}$,
that is, the set of elements of A which are not members of B. (The reader
is warned that in the literature one often encounters other notation for this
set, for example “$A \setminus B$”. We will stick with “$A - B$”.)

(d) If $\{A_\sigma\}_{\sigma \in I}$ is a collection of sets indexed by the set I, then either of the
symbols

$$\bigcap_{\sigma \in I} A_\sigma \quad \text{or} \quad \bigcap \{A_\sigma \mid \sigma \in I\}$$

denotes the set of elements which are members of each $A_\sigma$ and this set is
called the intersection of the sets $\{A_\sigma \mid \sigma \in I\}$.
Similarly, either one of the symbols

$$\bigcup_{\sigma \in I} A_\sigma \quad \text{or} \quad \bigcup \{A_\sigma \mid \sigma \in I\}$$

denotes the union of the sets $\{A_\sigma \mid \sigma \in I\}$, namely the set of elements which
are members of at least one of the $A_\sigma$.
Beyond this, there is the special case of a union which we call a partition.
We say that a collection $\pi := \{A_\sigma \mid \sigma \in I\}$ of subsets of a set X is a partition
of set X if and only if
i. each $A_\sigma$ is a non-empty subset of X (called a component of the partition),
and
ii. each element of X lies in a unique component $A_\sigma$, that is, $X = \bigcup \{A_\sigma \mid \sigma \in I\}$
and distinct components have an empty intersection.
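For finite sets, the two conditions defining a partition can be tested mechanically. The following Python sketch is an illustration only; the function name `is_partition` is hypothetical, and the code assumes that X is a finite Python `set` and that the components are supplied as an iterable of sets.

```python
def is_partition(components, X):
    """Check whether `components` is a partition of the finite set X."""
    # Equal sets are the same component, so collapse repeats first.
    components = {frozenset(c) for c in components}
    # Condition i: every component is a non-empty subset of X.
    if any(len(c) == 0 or not c <= X for c in components):
        return False
    # Condition ii: every element of X lies in exactly one component.
    return all(sum(x in c for c in components) == 1 for x in X)


if __name__ == "__main__":
    X = {1, 2, 3}
    print(is_partition([{1, 2}, {3}], X))     # a partition of X
    print(is_partition([{1, 2}, {2, 3}], X))  # 2 lies in two components
    print(is_partition([{1, 2}], X))          # 3 lies in no component
```

Counting, for each x, the components that contain it checks both halves of condition ii at once: a count of zero means the union misses x, and a count above one means two distinct components intersect.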


2. The Cartesian product construction, $A \times B$: That would be the collection
of all ordered pairs (a, b) (“ordered” in that we care which element appears
on the left in the notation) such that the element a belongs to set A and the
element b is a member of set B. Similarly for positive integer n we understand
the n-fold Cartesian product of the sets $B_1, \ldots, B_n$ to be the collection of all
ordered sequences (sometimes called “n-tuples”), $(b_1, \ldots, b_n)$ where, for $i = 1, 2, \ldots, n$,
the element $b_i$ is a member of the set $B_i$. This collection of n-tuples
is denoted

$$B_1 \times \cdots \times B_n.$$


3. Binary Relations: The student should be familiar with the device of viewing 
relations between objects as subsets of a Cartesian product of sets of these 
objects. Here is how this works: Suppose R is a subset of the Cartesian product
$A \times B$. One uses this theoretical device to provide a setting for saying that object
“a” in set A is related to object “b” in set B: one just says “element a is related
to element b” if and only if the pair (a, b) belongs to the subset R of $A \times B$.⁴
(This device seems adequate to handle any relation that is understood in some 
other sense. For example: the relation of being “first cousins” among members 
of the set P of living U.S. citizens, can be described as the set C of all pairs 
(x, y) in the Cartesian product P x P, where x is the first cousin of y.) 
The phrase “a relation on a set A” is intended to refer to a subset R of A x A. 
There are several useful species of such relations, such as equivalence relations, 
posets, simple graphs etc. 

4. Equivalence relations: Equivalence relations behave like the equal sign in elementary
mathematics. No one should imagine that any assertion that x is equal
to y (an assertion denoted by an “equation x = y”) is saying that x really is y. 
Of course that is impossible since one symbol is one side of the equation and 
the other is on the other side. One only means that in some respect (which may 
be limited by an observer’s ability to make distinctions) the objects x and y do 
not appear to differ. It may be two students in class with the same amount of 
money on their person, or it may be two presidential candidates with equally 
fruitless goals. What we need to know is how this notion that things “are the 
same” operates. We say that the relation R (remember it is a subset of $A \times A$) is
an equivalence relation if and only if it obeys these three rules:


(a) (Reflexive Property) For each $a \in A$, $(a, a) \in R$; that is, every element of
A is R-related to itself.

(b) (Symmetric Property) If $(a, b) \in R$, then $(b, a) \in R$; that is, if element a
is related to b, then also element b is related to a.

(c) (Transitive Property) If a is related to b and b is related to c then one must
have a related to c by the specified relation R.


Suppose R is an equivalence relation on the set A. Then, for any element $a \in A$,
the set [a] of all elements related to a by the equivalence relation R, is called the
equivalence class containing a, and such classes possess the following properties:


(a) For each $a \in A$, one has $a \in [a]$.
(b) For each $b \in [a]$, one has $[a] = [b]$.
(c) No element of $A - [a]$ is R-related to an element of [a].


⁴This is not just a matter of silly grammatical style. How many American Calculus books must
students endure which assert that a “function” (for example from the set of real numbers to itself)
is a “rule that assigns to each element of the domain set, a unique element of the “codomain” set”?
The “rules” referred to in that definition are presumably instructions in some language (for example 
in American English) and so these instructions are strings of symbols in some finite alphabet, 
syllabary, ideogramic system or secret code. The point is that such a set is at best only countably 
infinite whereas the collection of subsets R of A x B may well be uncountably infinite. So there is 
a very good logical reason for viewing relations as subsets of a Cartesian product. 
It follows that for an equivalence relation R on the non-empty set A, the equivalence
classes form the components of a partition of the set A in the sense defined
above. Conversely, if $\pi := \{A_\sigma \mid \sigma \in I\}$ is a partition of a set A, then we obtain a
corresponding equivalence relation $R_\pi$ defined as follows: the pair (x, y) belongs
to the subset $R_\pi \subseteq A \times A$ (that is, x and y are $R_\pi$-related) if and only if they
belong to the same component of the partition $\pi$. Thus there is a one-to-one
correspondence between equivalence relations on a set A and the partitions of the
set A.
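On a finite set the passage from an equivalence relation to its partition, and back again, can be carried out explicitly. The sketch below is only an illustration; the names `equivalence_classes` and `relation_from_partition` are hypothetical, and the relation is assumed to be stored as a Python set of ordered pairs, in keeping with the view of a relation as a subset of $A \times A$.

```python
def equivalence_classes(A, R):
    """Return the set of classes [a] = {b in A : (a, b) in R}, for a in A."""
    return {frozenset(b for b in A if (a, b) in R) for a in A}


def relation_from_partition(partition):
    """Return the set of pairs (x, y) lying in a common component."""
    return {(x, y) for comp in partition for x in comp for y in comp}


if __name__ == "__main__":
    # Example relation: congruence modulo 3 on {0, 1, ..., 5}.
    A = set(range(6))
    R = {(x, y) for x in A for y in A if (x - y) % 3 == 0}
    classes = equivalence_classes(A, R)
    print(sorted(sorted(c) for c in classes))
    # Going back from the partition recovers the original relation.
    print(relation_from_partition(classes) == R)
```

The round trip in the last line is the finite shadow of the one-to-one correspondence just described: starting from R, forming the classes, and then forming the relation of the resulting partition returns R.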

5. Partially ordered sets: Suppose a relation R on a set A satisfies the following
three properties:


(a) (Reflexive Property) For each element a of A, $(a, a) \in R$.

(b) (Transitive property) If a is related to b and b is related to c then one must 
have a related to c by the specified relation R. 

(c) (Antisymmetric property) If (a, b) and (b, a) are both members of R, then 
a=b. 


A set A together with such a relation R is called a partially-ordered set or poset, 
for short. Partially ordered sets are endemic throughout mathematics, and are 
the natural home for many basic concepts of abstract algebra, such as chain 
conditions, dependence relations or the statement of “Zorn’s Lemma”. Even the
famous Jordan-Hölder Theorem is simply a theorem on the existence of interval
measures in meet-closed semi-modular posets.

One often denotes the poset relation by writing $a \le b$, instead of $(a, b) \in R$.
Then the three axioms of a partially ordered set $(A, \le)$ read as follows:


(a) $x \le x$ for all $x \in A$.
(b) If $x \le y$ and $y \le z$, then $x \le z$.
(c) If $x \le y$ and $y \le x$, then $x = y$.


Note that the third axiom shows that the relations $x_1 \le x_2 \le \cdots \le x_n \le x_1$
imply that all the $x_i$ are equal.
A simple example is the relation of “being contained in” among a collection 
of sets. Note that our definition of equality of sets, realizes the anti-symmetric 
property. Thus, if set A is contained in set B, and set B is contained in set A then 
the two sets are the same collection of objects—that is, they are equal as sets. 
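For a finite set the three poset axioms can be verified by brute force. The following Python sketch is illustrative only; the function name `is_partial_order` is hypothetical, and the relation is assumed to be given as a two-argument predicate `leq(x, y)` standing for $x \le y$.

```python
def is_partial_order(A, leq):
    """Check reflexivity, transitivity and antisymmetry of leq on finite A."""
    A = list(A)
    reflexive = all(leq(x, x) for x in A)
    transitive = all(
        leq(x, z)
        for x in A for y in A for z in A
        if leq(x, y) and leq(y, z)
    )
    antisymmetric = all(
        x == y
        for x in A for y in A
        if leq(x, y) and leq(y, x)
    )
    return reflexive and transitive and antisymmetric


if __name__ == "__main__":
    # Containment among a collection of sets (frozenset's <= is "subset of").
    sets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
    print(is_partial_order(sets, lambda a, b: a <= b))
    # Divisibility on {-2, -1, 1, 2} is reflexive and transitive,
    # but 1 divides -1 and -1 divides 1 while 1 != -1.
    print(is_partial_order([-2, -1, 1, 2], lambda x, y: y % x == 0))
```

The second example shows why antisymmetry is a genuine requirement: divisibility on the non-zero integers satisfies the first two axioms but not the third, whereas containment among sets satisfies all three.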
6. Power sets: Given a set X, there is a set 2^X of all subsets of X, called the power
set of X. In many books, for example, Keith Devlin’s The Joy of Sets [16], the
notation P(X) is used in place of 2^X. In Example 2 on p. 35, we introduce
this notation when we regard 2^X as a partially ordered set with respect to the
containment relation between subsets, at which point it is called the “power
poset”. But in fact, virtually every time one considers the set 2^X, one is aware of
the pervasive presence of the containment relation, and so might as well regard
it as a poset. Thus in practice, the two notations 2^X and P(X) are virtually
interchangeable. Most of the time we will use P(X), unless there is some reason
not to be distracted by the containment relation or for the reason of a previous
commitment of the symbol “P”.
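For a finite set the power poset can be examined by brute force. The following Python sketch builds 2^X for X = {1, 2, 3} and checks the three poset axioms for the containment relation. The names power_set and is_partial_order are illustrative choices and are not notation used elsewhere in these notes.

```python
from itertools import chain, combinations

def power_set(xs):
    """Return all subsets of the finite set xs as a list of frozensets."""
    items = list(xs)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def is_partial_order(elements, leq):
    """Check reflexivity, transitivity and antisymmetry of leq by brute force."""
    reflexive = all(leq(a, a) for a in elements)
    transitive = all(
        leq(a, c)
        for a in elements for b in elements for c in elements
        if leq(a, b) and leq(b, c)
    )
    antisymmetric = all(
        a == b
        for a in elements for b in elements
        if leq(a, b) and leq(b, a)
    )
    return reflexive and transitive and antisymmetric

subsets = power_set({1, 2, 3})

def contained_in(a, b):
    return a <= b   # for frozensets, "<=" is the subset test

def properly_contained_in(a, b):
    return a < b    # proper containment is not reflexive

print(len(subsets))                                      # 8 subsets of a 3-element set
print(is_partial_order(subsets, contained_in))           # True
print(is_partial_order(subsets, properly_contained_in))  # False
```

Proper containment fails only the reflexive axiom, which is why the poset relation is written with “≤” rather than “<”.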
Mappings: The word “mapping” is intended to be indistinguishable from the
word “function” as it is used in most literature. We may define a mapping R :
X → Y as a subset R ⊆ X × Y with the property that for every x ∈ X, there exists
exactly one y ∈ Y such that the pair (x, y) is in R. Then the notation y = R(x)
simply means (x, y) ∈ R and we metaphorically express this fact by saying
“the function R sends element x to y”, as if the function were actively doing
something. The suggestive metaphors continue when we also render this same
fact (that y = R(x)) by saying that y is the image of element x or equivalently
that x is a preimage of y.
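The definition of a mapping as a special kind of relation can be tested directly on finite sets. The sketch below is only an illustration: the helper names is_mapping and image_of, and the sample sets, are assumptions made for this example.

```python
def is_mapping(relation, domain, codomain):
    """Return True if relation (a set of pairs) is a mapping domain -> codomain."""
    # The relation must be a subset of domain x codomain.
    if any(x not in domain or y not in codomain for (x, y) in relation):
        return False
    # Every x in the domain must occur in exactly one pair (x, y).
    return all(
        sum(1 for (a, _) in relation if a == x) == 1 for x in domain
    )

def image_of(relation, x):
    """Return the unique y with (x, y) in relation, that is, y = R(x)."""
    (y,) = [b for (a, b) in relation if a == x]
    return y

X = {1, 2, 3}
Y = {"a", "b"}
R = {(1, "a"), (2, "a"), (3, "b")}
multivalued = R | {(1, "b")}          # 1 would have two images
partial = {(1, "a"), (2, "a")}        # 3 would have no image

print(is_mapping(R, X, Y))            # True
print(is_mapping(multivalued, X, Y))  # False
print(is_mapping(partial, X, Y))      # False
print(image_of(R, 3))                 # b
```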


Images and range: If f : X → Y is a mapping, the collection of all “images”
f(x), as x ranges over X, is clearly a subset of Y which we call the image or
range of the function f and it is denoted f(X).

Equality of mappings: Two mappings are considered equal if they “do the same 
things”. Thus if f and g are both mappings (or functions) from X to Y we say 
that mapping f is equal to mapping g if and only if f(x) = g(x) for all x in X. 
(Of course this does not mean that f and g are described or defined in the same 
way. Asserting that two mappings are equal is often a non-obvious Theorem.) 
Identity mappings: A very special example is the following: The mapping
1_X : X → X which takes each element x of X to itself (i.e. 1_X(x) = x) is
called the identity mapping on set X. This mapping is very special and is uniquely
defined just by specifying the set X.

Domains, codomains, restrictions and extensions of mappings: In defining
a mapping f : X → Y, the sets X and Y are a vital part of the definition of a
mapping or function. The set X is called the domain of the function; the set Y is
called the codomain of the mapping or function.

A simple manipulation of both sets allows us to define new functions from old
ones. For example, if A is a subset of the domain set X, and f : X → Y is a
mapping, then we obtain a mapping


f|_A : A → Y


which sends every element a ∈ A to f(a) (which is defined a fortiori). This new
function is called the restriction of the function f to the subset A. If g = f|_A,
we say that f extends function g.

Similarly, if the codomain Y of the function f : X → Y is a subset of a set B
(that is, Y ⊆ B), then we automatically inherit a function f|^B : X → B just
from the definition of “function”. When f : X → Y is the identity mapping
1_X : X → X, the replacement of the codomain X by a larger set B yields a
mapping 1_X|^B : X → B called the containment mapping.

Note that there is no grammatical room here for a “multivalued function”.


Unlike the notion of “restriction”, this construction does not seem to enjoy a uniform name.

One-to-one mappings (injections) and onto mappings (surjections): 

A mapping f : X → Y is called onto or surjective if and only if Y = f(X) as
sets. That means that every element of set Y is the image of some element of set
X, or, put another way, the fibre f^{-1}(y) of each element y of Y is nonempty.
A mapping f : X → Y is said to be one-to-one or injective if any two distinct
elements of the domain are not permitted to yield the same image element. This
is another way of saying that the fibre f^{-1}(y) of each element y of Y is permitted
to have at most one element.

The reader should be familiar with the fact that the composition of two injective
mappings is injective and that the composition of two surjective mappings
(performed in that chronological order) is surjective. She or he should be able
to prove that the restriction f|_A : A → Y of any injective mapping f : X → Y
(here A is a subset of X) is an injective mapping.

Bijections: 

A mapping f : X → Y is a bijection if and only if it is both injective and
surjective, that is, both one-to-one and onto.

When this occurs, the fibre f^{-1}(y) of every element y ∈ Y contains a unique
element which can unambiguously be denoted f^{-1}(y). This notation allows us
to define the unique function f^{-1} : Y → X which we call the inverse of the
bijection f. Note that the inverse mapping possesses these properties,


f^{-1} ∘ f = 1_X, and f ∘ f^{-1} = 1_Y,


where 1_X and 1_Y are the identity mappings on sets X and Y, respectively.
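The fibre descriptions of injective, surjective and bijective mappings translate directly into a small computation. In this sketch a mapping on a finite set is stored as a Python dict; that representation and the function names are assumptions made for the illustration.

```python
def fibre(f, y):
    """Return the fibre f^{-1}(y) = {x : f(x) = y} of a mapping stored as a dict."""
    return {x for x, image in f.items() if image == y}

def is_injective(f, codomain):
    """Every fibre has at most one element."""
    return all(len(fibre(f, y)) <= 1 for y in codomain)

def is_surjective(f, codomain):
    """Every fibre is nonempty."""
    return all(len(fibre(f, y)) >= 1 for y in codomain)

def inverse(f, codomain):
    """Return the inverse mapping of a bijection, again as a dict."""
    if not (is_injective(f, codomain) and is_surjective(f, codomain)):
        raise ValueError("only a bijection has an inverse mapping")
    return {y: x for x, y in f.items()}

f = {1: "a", 2: "b", 3: "c"}          # a bijection {1, 2, 3} -> {a, b, c}
g = {1: "a", 2: "a", 3: "b"}          # onto {a, b}, but not one-to-one

f_inv = inverse(f, {"a", "b", "c"})
print(all(f_inv[f[x]] == x for x in f))          # f^{-1} o f is the identity on X
print(all(f[f_inv[y]] == y for y in f_inv))      # f o f^{-1} is the identity on Y
print(is_surjective(g, {"a", "b"}), is_injective(g, {"a", "b"}))
```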
Examples using mappings: 


(a) Indexing families of subsets of a set X with the notation {X_α}_I (or {X_α}_{α∈I})
should be understood in its guise as a mapping I → 2^X.⁷

(b) The construction hom(A, B) as the set of all mappings A → B. (This is
denoted B^A in some parts of mathematics.) The reader should see that if A is
a finite set, say with n elements, then hom(A, B) is just the n-fold Cartesian
product of B with itself


B × B × ··· × B (with exactly n factors).


Also if N denotes the collection of all natural numbers (including zero), then
hom(N, B) is the set of all sequences of elements of B.


⁷ Note that in the notation, the “α” is ranging completely over I and so does not itself affect the
collection being described; it is what logicians call a “bound” variable.

(c) Recall from an earlier item (p. 5) that a partition of a set X is a collection
{X_α}_I of non-empty subsets X_α of X such that (i) ∪_{α∈I} X_α = X, and (ii)
the sets X_α are pairwise disjoint, that is, X_α ∩ X_β = ∅ whenever α ≠ β.
The sets X_α of the union are called the components of the partition.
A partition may be described in another way: as a surjection π : X → I.
Then the collection of fibers, that is, the sets π^{-1}(α) := {x ∈ X | π(x) = α}
as α ranges over I, form the components of a partition. Conversely, if {X_α}_I
is a partition of X, then there is a well-defined surjection π : X → I which
takes each element of X to the index of the unique component of the partition
which contains it.
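The passage between partitions and surjections in item (c) can be seen on a small example. The sketch below takes the surjection x ↦ x mod 3 on {0, 1, ..., 9}; the function names fibres and index_map are hypothetical helpers chosen for this illustration.

```python
from collections import defaultdict

def fibres(pi, domain):
    """Group the elements of domain by their image under the mapping pi."""
    components = defaultdict(set)
    for x in domain:
        components[pi(x)].add(x)
    return dict(components)

def index_map(components):
    """Recover the surjection sending x to the index of its component."""
    return {x: index for index, part in components.items() for x in part}

X = set(range(10))

def pi(x):
    return x % 3      # a surjection from X onto I = {0, 1, 2}

parts = fibres(pi, X)
print(parts)          # the three fibres of pi

# The fibres cover X and are pairwise disjoint, so they form a partition.
covers = set().union(*parts.values()) == X
disjoint = sum(len(p) for p in parts.values()) == len(X)
print(covers, disjoint)

# Going back: the index map of the partition agrees with pi.
print(all(index_map(parts)[x] == pi(x) for x in X))
```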


15. A notational convention on partitions: In these lecture notes if A and B are
sets, we shall write X = A + B (rather than X = A ∪ B or X = A ⊎ B) to
express the fact that {A, B} is a partition of X with just two components A and
B. Similarly we write

X = X_1 + X_2 + ··· + X_n


when X possesses a partition with n components X_i, i = 1, 2, ..., n. This
notation goes back to Galois’ rendering of a partition of a group by cosets of
a subgroup. The notation is very convenient since one doesn’t have to “doctor
up” a “cup” (or “union”) symbol. Unfortunately similar notation is also used in
more algebraic contexts with a different meaning, for example as a set of sums
in some additive group. We resolve the possible ambiguity in this way:

When a partition (rather than, say, a set of sums) is intended, the partition will 
simply be introduced by the words “partition” or “decomposition”. 


1.1.4 Notation for Compositions of Mappings 


There is a perennial awkwardness cast over common mathematical notation for the 
composition of two maps. Mappings are sometimes written as left operators and 
sometimes as right operators, and the awkwardness is not the same for both choices 
due to the asymmetric fact that English is read from left to right. Because of this, 
right operators work much better for representing the action of sets with a binary 
operation as mappings with compositions. 

Then why are left operators used at all? There are two answers: Suppose there is
a division of the operators on X into two sets, say A and B. Suppose also that if
an operator a ∈ A is applied first in chronological order and the operator b ∈ B is
applied afterwards, the result is always the same as if we had applied b first and then
applied a later. Then we say that the two operations “commute” (at least in the time
scale of their application, if not the temporal order in which the operators are read
from left to right). This property can often be more conveniently rendered by having
one set, say A, be the left operators, and the other set B be the “right” operators,
and then expressing the “commutativity” as something that looks like an “associative
law”. So one needs both kinds of operators to do that.

The second reason for using left operators is more anthropological than mathematical.
The answer comes from the sociological accident of English usage. In the
English phrase “function of x”, the word “function” comes first, and then its argument.
This is reflected in the left-to-right notation “f(x)” so familiar from calculus.

Then if the composition α ∘ β were to mean “α is applied first and then β is
applied”, one would be obligated to write (α ∘ β)(x) = β(α(x)). That is, we must
reverse their “reading” order.

On the other hand, if we say α ∘ β means β is applied first and α is applied
second, so that (α ∘ β)(x) = α(β(x)), then things are nice as far as the treatment
of parentheses is concerned, but we still seem to be reading things in the reverse
chronological order (unless we compensate by reading from right to left). Either way
there is an inconvenience.

In the vast majority of cases, the mathematical literature has already chosen the
latter as the lesser of the two evils. Accordingly, we adopt this convention:


Notation for Composition of Mappings: If α : X → Y and β : Y → Z, then
β ∘ α denotes the result of first applying mapping α to obtain an element y of Y,
and then applying mapping β to y. Thus if the mappings α and β are regarded as
left operators of X and Y, respectively, we have, for each x ∈ X,


(β ∘ α)(x) = β(α(x)).


But if α and β are right operators on X and Y, respectively, we have, for each
x ∈ X,


x(β ∘ α) = (xα)β.


But right operators are also very useful. A common instance is when (i) the set
F consists of functions mapping a set X into itself, and (ii) F itself possesses an
associative binary operation “∗” (see Sect. 1.4) and (iii) composition of two such
functions is the function representing the binary operation of them, that is, f ∗ g
acts as g ∘ f. In this case, it is really handier to think of the functions as right operators,
so that we can write

(x^f)^g = x^{f∗g},


for all x ∈ X and f and g in F. For this reason we tend to view the action of groups
or rings as induced mappings which are right operators.

Finally, there are times when one needs to discuss “morphisms” which commute
with all right operators in F. It is then easier to think of these morphisms as left
operators, for if the function α commutes with the right operator g, we can express
this by the simple equation


α(x^g) = (α(x))^g, for all x ∈ X and g in F.
So there are cases where both types of operators must be used.
The convention of these notes is to indicate right operators by exponential notation
or with explicit apologies (as in the case of right R-modules), as right multiplications.
Thus in general we adopt these rules:


Rule #1: The symbol α ∘ β denotes the composition resulting from first applying
β and then α in chronological order. Thus


(α ∘ β)(x) = α(β(x)).


Rule #2: Exponential notation indicates right operators. Thus compositions have
these images:


x^{α∘β} = (x^β)^α.


Exception to Rule #2: For right R-modules we indicate the right operators by right
“multiplication”, that is, right juxtaposition. Ring multiplication still gets represented
the right way since we have (mr)s = m(rs) for module element m and
ring elements r and s (it looks like an associative law). (The reason for eschewing
the exponential notation in this case is that the law m^{r+s} = m^r + m^s for right
R-modules would then not look like a right distributive law.)
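The two conventions can be compared side by side in a short Python sketch. Everything here is an illustrative device rather than notation from these notes: compose implements Rule #1 for left operators, and the hypothetical class RightOperator overloads Python's ** so that x ** f can be read as x^f, with f * g meaning “first f, then g”.

```python
def compose(alpha, beta):
    """Rule #1: (alpha o beta)(x) = alpha(beta(x)), so beta is applied first."""
    return lambda x: alpha(beta(x))

class RightOperator:
    """A function written as a right operator: x ** f stands for x^f."""

    def __init__(self, func):
        self.func = func

    def __rpow__(self, x):
        # Called for the expression x ** self.
        return self.func(x)

    def __mul__(self, other):
        # f * g acts as "first f, then g", so that (x^f)^g = x^(f*g).
        return RightOperator(lambda x: other.func(self.func(x)))

def double(x):
    return 2 * x

def increment(x):
    return x + 1

# Left operators: the rightmost mapping acts first.
print(compose(double, increment)(3))   # double(increment(3)) = 8
print(compose(increment, double)(3))   # increment(double(3)) = 7

# Right operators: the expression is read in chronological order.
f, g = RightOperator(double), RightOperator(increment)
print((3 ** f) ** g)                   # 7
print(3 ** (f * g))                    # 7
```

With right operators the symbols appear in the order in which the mappings are applied, which is the convenience described above.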


1.2 Binary Operations and Monoids 


It is not our intention to venture into various algebraic structures at such an early 
stage in this book. But we are forced to make an exception for monoids, since they 
are always lurking around so many of the most basic definitions (for example, the 
definition of interval measures on posets). 
Suppose X is a set and let X^{(n)} be the n-fold Cartesian product of X with itself.
For n > 0, a mapping

X^{(n)} → X


is called an n-ary operation on X. If n = 1 such an operation is just a mapping of
X into itself. There are certain concepts that are brought to bear at the level of 2-ary
(or binary) operations that fade away for larger n.

We say that set X admits a binary operation if there exists a 2-ary function
f : X × X → X. In this case, it is possible to indicate the operation by a constant
symbol (say “∗”) inserted between the elements of an ordered pair; thus one might
write “x ∗ y” to indicate f((x, y)) (which we shall write as f(x, y) to rid ourselves
of one set of superfluous parentheses).

Indeed one might even use the empty symbol in this role, that is, f(x, y) is
represented by “juxtaposition” of symbols: we write “xy” for f(x, y). The
juxtaposition convention seems the simplest way to describe properties of a general
but otherwise unspecified binary operation.

The binary operation on X is said to be commutative if and only if 


xy = yx, for all x and y in X.


The operation is associative if and only if


x(yz) = (xy)z for all x, y, z ∈ X.


Again let us consider an arbitrary binary operation on X. Do not assume that it
is associative or commutative. The operation admits a left identity element if there
exists an element, say e_L in X, such that e_L x = x for all x ∈ X. Similarly, we say
the operation admits a right identity element if there exists an element e_R such that
x e_R = x for all x ∈ X. However, if X admits both a left identity element and a right
identity element, say e_L and e_R, respectively, then the two are equal, for


e_R = e_L e_R = e_L.


(The first equality is from e_L being a left identity, and the second is from e_R being a
right identity.) We thus have


Proposition 1.2.1 Suppose X is a set admitting a (not necessarily associative) 
binary operation indicated by juxtaposition. Suppose this operation on X admits 
at least one left identity element and at least one right identity element (they need not 
be distinct elements). Then all right identity elements and all left identity elements 
are equal to a unique element e for which ex = xe = x for all x ∈ X. (Such an
element e is called an identity element or a two-sided identity for the given binary 
operation on X.) 


A set admitting an associative binary operation is called a semigroup. For example 
if X contains more than two elements and the binary operation is defined by xy = y 
for all x, y € X, then with respect to this binary operation, X is a semigroup with 
many left identity elements and no right identity elements. 
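This example is easy to verify by exhaustive search on a small set. The sketch below uses a three-element set and hypothetical helper names; it also checks addition modulo 4, where Proposition 1.2.1 applies and the left and right identities coincide.

```python
from itertools import product

def is_associative(elements, op):
    """Check x(yz) = (xy)z for all triples of elements."""
    return all(
        op(x, op(y, z)) == op(op(x, y), z)
        for x, y, z in product(elements, repeat=3)
    )

def left_identities(elements, op):
    """All e with e x = x for every x."""
    return [e for e in elements if all(op(e, x) == x for x in elements)]

def right_identities(elements, op):
    """All e with x e = x for every x."""
    return [e for e in elements if all(op(x, e) == x for x in elements)]

X = ["a", "b", "c"]

def second(x, y):
    return y                     # the operation xy = y

print(is_associative(X, second))       # True: both x(yz) and (xy)z equal z
print(left_identities(X, second))      # every element is a left identity
print(right_identities(X, second))     # there is no right identity

Z4 = [0, 1, 2, 3]

def add_mod_4(x, y):
    return (x + y) % 4

print(left_identities(Z4, add_mod_4), right_identities(Z4, add_mod_4))
```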

A semigroup with a two-sided identity is called a monoid. A semigroup (or 
monoid) with respect to a commutative binary operation is simply called a com- 
mutative semigroup (or commutative monoid). 

We list several commonly encountered monoids. 


1. The set N of non-negative integers (natural numbers) under the operation of 
ordinary addition. 

2. Let X be any set. We let M(X) be the set of all finite strings (including the empty 
string) of the elements of X. A string is simply a sequence of elements of X. It 
becomes a monoid under the binary operation of concatenation of strings. The
concatenation of strings is the string obtained by extending the first sequence by
adjoining the second one as a “suffix”. Thus if s_1 = (x_1, x_2, x_3) and s_2 = (y_1, y_2),
then the concatenation would be s_1 ∗ s_2 = (x_1, x_2, x_3, y_1, y_2). (Note that concatenation
is an associative operation, and that the empty string is the two-sided
identity of this monoid.) For reasons that will be made clear in Chaps. 6 and 7,
it is called the free monoid on the set X.


3. A multiset of the set X is a function f : X → N that assigns to each element of
X a non-negative integer. In this sense a multiset represents a sort of inventory of
objects of various types drawn from a set of types X. The collection of all multisets
on set X, denoted M(X), admits a commutative binary operation which we call
“addition”. If f, g : X → N are two multisets, their sum f + g is defined to be the
function that sends x to f(x) + g(x). In the language of inventories, addition of
two multisets is just the merging of inventories. Since this addition is associative
and the empty multiset (the function with all values zero) is a two-sided identity,
the multisets on X form a monoid (M(X), +), with respect to addition. This
monoid is also called the free commutative monoid on set X, an appellation fully
justified in Chap. 7.

A multiset f is finite if the set of elements x at which the function f assumes
a positive value (called the support of the function) is a finite set. By setting
f(i) = a_i, i ∈ N, we can write any finite multiset as a countable sequence
(a_0, a_1, ...) of natural numbers which has only finitely many non-zero entries.
Addition of two multisets (a_0, ...) and (b_0, ...) is then performed coordinatewise.
We denote this submonoid of finite multisets of elements chosen from X
by the symbol (M_{<∞}(X), +). If the set X is itself finite, then, of course, all
elements of M(X) are finite multisets.
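Two of these monoids have close relatives among Python's built-in data types, which gives a quick way to experiment with them. In this sketch (an illustration under the stated assumptions, not a definition) strings over X are modelled as tuples with concatenation “+”, and finite multisets are modelled by collections.Counter, whose “+” adds multiplicities.

```python
from collections import Counter

# Free monoid on X: finite strings as tuples, concatenation as the operation.
empty_string = ()
s1 = ("x1", "x2", "x3")
s2 = ("y1", "y2")
s3 = ("z1",)
print(s1 + s2)                                   # s1 followed by s2 as a suffix
print((s1 + s2) + s3 == s1 + (s2 + s3))          # associativity
print(empty_string + s1 == s1 == s1 + empty_string)

# Finite multisets on X: an inventory records how many objects of each type.
inventory_f = Counter({"apple": 2, "pear": 1})
inventory_g = Counter({"apple": 1, "plum": 4})
empty_multiset = Counter()

merged = inventory_f + inventory_g               # addition merges inventories
print(merged["apple"], merged["pear"], merged["plum"])
print(inventory_f + inventory_g == inventory_g + inventory_f)   # commutative
print(inventory_f + empty_multiset == inventory_f)              # identity
```

Concatenation of tuples is not commutative in general, while addition of multisets is, which reflects the difference between the free monoid and the free commutative monoid.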


1.3 Notation for Special Structures 


There are certain sets and structures which are encountered over and over, and these
will have a special fixed notation throughout this book.


Standard Sets 


1. N. The system of natural numbers, {0, 1, 2, ...}. It is important for the student
to realize that this term is understood here to include the integer zero. It is a
well-ordered poset with the descending chain condition (see Chap. 2).

2. Z. This is the collection of all integers, {..., −2, −1, 0, 1, 2, ...}. It forms an
integral domain under the operations of addition and multiplication (Chap. 7).

3. Q. The field of rational numbers.

4. R. The field of real numbers.

5. C. The field of complex numbers.
1.4 The Axiom of Choice and Cardinal Numbers


1.4.1 The Axiom of Choice 


The student ought to be acquainted with and to be able to use 

THE AXIOM OF CHOICE: Suppose {X_α}_I is a family of pairwise disjoint non-empty
sets. Then there exists a subset R of the union of the X_α, which meets each
component X_α in exactly one element.

This assertion is about the existence of systems of representatives. If the index set I is
finite (so that only finitely many sets X_α are involved), this can be proved from ordinary set theory. But the
reader should be aware that in its full generality it is independent of set theory, and
yet is consistent with it. The reader is thus encouraged to think of it as an adjunct
axiom to set theory, to make a note of each time it is used, and to quietly produce the 
appropriate guilt feelings when using it. 

The Axiom of Choice has many uses. For example it guarantees the existence of a 
system of coset representatives for any subgroup of any group. In fact we shall see an 
application of the axiom of choice in the very next subsection on cardinal numbers. 

In the presence of set theory (which is actually ever-present for the purposes of 
these notes) the Axiom of Choice is equivalent to another assertion called Zorn’s 
Lemma, which should also be familiar to the reader. Since it appears in the setting 
of partially ordered sets, a full discussion of Zorn’s Lemma is deferred to a section 
of the next chapter. 

The reader is not required to know a proof of the equivalence of the Axiom of 
Choice and Zorn’s Lemma, or a proof of their consistency with set theory. At this 
stage, all that is required to read these notes is the psychological assurance that one 
cannot “get into trouble” by using these principles. For a good development of the 
many surprising equivalent versions of the Axiom of Choice the curious student is 
encouraged to peruse Sect. 2.2.7 of the book by Keith Devlin entitled The Joy of
Sets [16]. 


1.4.2 Cardinal Numbers 


Not all collections of things are actually amenable to the axioms of set theory, as
Russell’s paradox illustrates. Nonetheless certain operations and constructions on
such collections can still exist. In particular, such a collection may still possess an
equivalence relation, and this is the case for the collection of all sets.

We have mentioned that a mapping f : A → B which is both an injection and
a surjection is called a bijection or a one-to-one correspondence in a slightly older
language.⁸ In that case the partition of A defined by the surjection f (see above) is
the trivial partition of A into its one-element subsets. This means the fibering f^{-1}
defines an inverse mapping f^{-1} : B → A which is also a bijection satisfying
f ∘ f^{-1} = 1_B and f^{-1} ∘ f = 1_A as described in item 15 on p. 9. It is clear,
using obvious facts about compositions of bijections and identity mappings, that the
relation of two sets having a bijection connecting them is an equivalence relation.
The resulting equivalence classes are called cardinal numbers and the equivalence
class containing set X is denoted |X| and is called the cardinality of X.


⁸ “One-to-one correspondence” is not to be confused with the weaker notion of a “one-to-one
mapping” introduced on p. 9. The latter is just an injective mapping which may or may not be a
bijection.

Cardinal numbers possess an inherent partial ordering. One writes |X| ≤ |Y| if
and only if there exists an injection f : X → Y.⁹ We are obligated to show three
things:


1. The relation “≤” is well-defined, that is, if, for sets X, Y and Z, we have that
|X| ≤ |Y| and |Y| = |Z|, then |X| ≤ |Z|.

2. The relation “≤” is transitive.

3. The relation “≤” is anti-symmetric.


First we observe that our definitions force the transitive law (item 2). Suppose,
for sets X, Y and Z, one has |X| ≤ |Y| and |Y| ≤ |Z|. Then from our definition of
“≤” there exist injective mappings f : X → Y and g : Y → Z. Then g ∘ f is an
injective mapping from X to Z, and so by definition, |X| ≤ |Z|.

Next the student may observe that |Y| = |Z| implies |Y| ≤ |Z|, for the former
statement provides a bijection g : Y → Z, and as any bijection is injective, |Y| ≤ |Z|
by definition. This observation together with the transitive law implies the statement
of item 1.

For the anti-symmetric law we appeal to a famous Theorem: 


Theorem 1.4.1 (The Schröder-Bernstein Theorem) If, for two sets X and Y, one
has

|X| ≤ |Y| and |Y| ≤ |X|,


then |X| = |Y|.


Proof By hypothesis there are two injective (one-to-one) mappings f : X → Y
and g : Y → X. Our task is to use this data to devise a bijective (one-to-one onto)
mapping h : X → Y.

We may assume that neither of the injective mappings f or g is surjective (onto),
for otherwise either f or g^{-1} will serve as our desired mapping h.

As a result, f(X) is a proper subset of Y and, as g is injective, g(f(X)) = (g ∘
f)(X) is a proper subset of g(Y). In this way, one obtains a properly descending
chain of subsets:


X ⊃ g(Y) ⊃ (g ∘ f)(X) ⊃ (g ∘ f ∘ g)(Y) ⊃ ···. (1.1)


Transposing the roles of f and g presents a second properly descending chain:


Y ⊃ f(X) ⊃ (f ∘ g)(Y) ⊃ (f ∘ g ∘ f)(X) ⊃ ···. (1.2)


An element a ∈ X ∪ Y is said to be an ancestor of element b ∈ X ∪ Y if and only if
there is a finite string of alternate applications of the mappings f and g which, when
applied to a, yields b. For example, if a, b ∈ X, and (g ∘ ··· ∘ f)(a) = b, for some finite
string g ∘ ··· ∘ f, then a is an ancestor of b. Thus, the non-empty set X_0 := X − g(Y)
consists of the members of X which have no ancestors; the set X_1 := g(Y) − (g ∘ f)(X)
comprises the members of X with exactly one ancestor. We let X_k denote the set
of elements of X with exactly k ancestors, namely, the non-empty set


X_k = g ∘ (f ∘ g)^{(k−1)/2}(Y) − (g ∘ f)^{(k+1)/2}(X), if k is odd, or
X_k = (g ∘ f)^{k/2}(X) − g ∘ (f ∘ g)^{k/2}(Y), if k is even.


⁹ It should be clear to the student that this partial ordering is on the collection of cardinal numbers.
It is not a relation between the sets themselves.


The symbol X_∞ will denote the set of elements of X which possess infinitely
many ancestors. If the intersection of the sets in the tower of Eq. (1.1) is empty, then
there are no elements with infinitely many ancestors. Thus, unlike the X_k, the set
X_∞ could be empty.

Next we similarly define the non-empty sets Y_k, as members of Y with exactly
k ancestors, and let Y_∞ denote the set of those members of Y with infinitely many
ancestors. Now if an element z of X (or Y) possesses infinitely many ancestors, then
so do g^{-1}(z) and f(z) (or f^{-1}(z) and g(z) when z ∈ Y). Thus f restricted to X_∞
induces a bijection h_∞ : X_∞ → Y_∞ whether the sets are empty or not. It remains
only to devise a bijection X − X_∞ → Y − Y_∞.

We now have two partitions (into non-empty sets):


X − X_∞ = X_0 + X_1 + X_2 + ···
Y − Y_∞ = Y_0 + Y_1 + Y_2 + ···


If k is even, define h_k : X_k → Y_{k+1} as the restriction of the mapping f to the
subset X_k. Note that h_k is surjective, and so is a bijection. If k is odd, then X_k lies
in g(Y) and so the inverse mapping g^{-1} may be applied to it, to produce a mapping
h_k : X_k → Y_{k−1}, that is surjective and injective since g was a mapping. Thus h_k is
a bijection in this case as well.

Now h : X → Y is defined by h : x ↦ h_k(x) if x ∈ X_k, and x ↦ h_∞(x) if
x ∈ X_∞. Since the h_k are all bijections and the codomains of the h_k reproduce the
partition of Y − Y_∞ given above, h is our desired bijection.
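The construction in this proof can be carried out explicitly in a simple case. In the sketch below we take X = Y = N with f(n) = n + 1 and g(n) = n + 1, both injective and neither surjective. These choices, the partial inverses f_inv and g_inv, and the cut-off limit are assumptions made for the illustration; in this example every element has only finitely many ancestors, so the case of X_∞ does not arise.

```python
def f(x):
    return x + 1                      # injection X -> Y whose image misses 0

def g(y):
    return y + 1                      # injection Y -> X whose image misses 0

def f_inv(y):
    """Partial inverse of f: None when y is not in f(X)."""
    return y - 1 if y >= 1 else None

def g_inv(x):
    """Partial inverse of g: None when x is not in g(Y)."""
    return x - 1 if x >= 1 else None

def ancestor_count(x, limit=10_000):
    """Count the ancestors of x in X by alternately applying g_inv and f_inv."""
    count, current, in_x = 0, x, True
    while count < limit:
        parent = g_inv(current) if in_x else f_inv(current)
        if parent is None:
            return count
        current, in_x, count = parent, not in_x, count + 1
    raise RuntimeError("ancestor chain too long for this sketch")

def h(x):
    """Use f on X_k for even k, and the inverse of g on X_k for odd k."""
    k = ancestor_count(x)
    return f(x) if k % 2 == 0 else g_inv(x)

print([ancestor_count(x) for x in range(6)])   # here x has exactly x ancestors
print([h(x) for x in range(6)])                # h swaps 0 and 1, 2 and 3, ...
```

In this example X_k = {k}, and the resulting bijection h interchanges each even number with its successor.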


Remark This elementary proof due to J. König may be found in the book of P.M.
Cohn entitled Algebra, vol. 2 [10, p. 11] and in the book of Birkhoff and Mac Lane [8].


These lecture notes presume and use two further results concerning cardinalities 
of sets: 


Theorem 1.4.2 If there exists a surjection f : X → Y then |Y| ≤ |X|.


Proof With the axiom of choice one gets a subset X′ containing exactly one element
x_y from each fiber f^{-1}(y) as y ranges over Y. One then has an injection Y → X′
taking y to x_y. We can then compose this with the inclusion mapping i : X′ → X to
obtain the desired injection from Y into X.
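For finite sets the choice of representatives requires no axiom, and the argument can be imitated directly. The following sketch picks the first element encountered in each fiber; the function name section and the sample surjection are assumptions made for this example.

```python
def section(f, domain):
    """Pick one representative x_y from each fiber of f, giving y -> x_y."""
    representatives = {}
    for x in domain:
        # Keep only the first x found in each fiber f^{-1}(f(x)).
        representatives.setdefault(f(x), x)
    return representatives

X = range(10)

def f(x):
    return x // 3                 # a surjection from X onto Y = {0, 1, 2, 3}

s = section(f, X)
print(s)                                        # one representative per fiber
print(all(f(x_y) == y for y, x_y in s.items())) # each x_y lies in the fiber of y
print(len(set(s.values())) == len(s))           # y -> x_y is injective
```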


As usual, let N denote the set of natural numbers, that is, the set {0, 1,2, ...} of 
all integers which are positive or zero (non-negative). 

A cardinal number is defined to be the name of a cardinality equivalence class.
For a finite set F, the cardinality |F| is simply a natural number. Thus, the cardinality
of the empty set ∅ is 0, and for non-empty finite sets, this association with natural
numbers seems perfectly natural since any finite set is bijective with some finite initial
segment, say (1, 2, ..., n), of the positive integers listed in their natural ordering.
Indeed, producing that bijection is what we usually refer to as “counting”.

One can define a product of two cardinal numbers, in the following way: If a =
|A| and b = |B| are two cardinal numbers (A and B chosen representatives from
the equivalence classes of sets denoted by a and b, respectively), then one writes
ab = |A × B|, the cardinality of the Cartesian product of A and B. This product
is well-defined, for if one selected other representatives A′ and B′ of these classes,
there are then bijections α : A → A′ and β : B → B′ which can be used to define a
bijection A × B → A′ × B′ defined by


(a, b) ↦ (α(a), β(b)), for all (a, b) ∈ A × B.

Similarly, the mapping

((a, b), c) ↦ (a, (b, c)), for all (a, b, c) ∈ A × B × C


defines a bijection (A × B) × C → A × (B × C). Thus we see that taking the product
among cardinal numbers is an associative operation.

The reader will appreciate that this definition of product is completely compatible 
with the definition of multiplication of positive integers familiar to most children. If 
A contains three elements, and B contains seven, then the Cartesian product A x B 
contains twenty-one distinct pairs. However now, our definition can be applied to 
cardinalities of infinite sets, as well. 

The simplest infinite set familiar to the young student is the set of natural numbers
itself. Custom has assigned the rather unique symbol ℵ₀ for the cardinality of N. Any
set is said to be countably infinite if its cardinality is ℵ₀, or equivalently, if there is a
bijection taking such a set to N.

Now a property which distinguishes infinite sets from finite sets is that an infinite 
set can be bijective with a proper subset of itself. For example, N is bijective with 
the non-negative even integers. It is also bijective with all of the natural numbers that 
are perfect squares. Similarly, the integers Z are bijective with N by the mapping
defined by

n ↦ 2n for n ∈ N,


−n ↦ 2n − 1 for n ∈ N, n > 0.
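A bijection between Z and N of this kind sends the non-negative integers to the even natural numbers and the negative integers to the odd ones. The sketch below writes out such a mapping and its inverse; the function names z_to_n and n_to_z are illustrative.

```python
def z_to_n(z):
    """Send n >= 0 to 2n, and -n (with n > 0) to 2n - 1."""
    return 2 * z if z >= 0 else -2 * z - 1

def n_to_z(n):
    """Inverse mapping: even numbers come from n/2, odd ones from -(n + 1)/2."""
    return n // 2 if n % 2 == 0 else -((n + 1) // 2)

print([z_to_n(z) for z in (0, -1, 1, -2, 2)])        # [0, 1, 2, 3, 4]
print([n_to_z(n) for n in range(5)])                 # [0, -1, 1, -2, 2]
print(all(n_to_z(z_to_n(z)) == z for z in range(-50, 51)))
```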


It is also easy to see this: 


Theorem 1.4.3 If k is a positive integer, then the cardinality of the union of k disjoint
copies of N is |N| = ℵ₀.


Proof This is left as an exercise. 


Now something peculiar happens: 


Theorem 1.4.4 ℵ₀ · ℵ₀ = ℵ₀.


Proof We must produce a bijection between N and N × N. First we assign the pair
(a, b) ∈ N × N to the point in the real (Cartesian) plane with those coordinates. All the
points with integral coordinates in and on the boundary of the first quadrant now represent
the elements of N × N. Now partition the points into non-empty finite sets according to the sum of
their coordinates. First comes (0, 0), then {(0, 1), (1, 0)}, then {(0, 2), (1, 1), (2, 0)},
and so on. Having ordered the points within each component of the partition by the
natural ordering of its first coordinate, we obtain in this way a sequence S listing
all of N × N. Mapping the n-th member of this sequence (n = 1, 2, ...) to n − 1 produces a bijection
g : N × N → N.
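The enumeration by diagonals in this proof has a closed form, which the following sketch makes explicit. A pair (a, b) with a + b = s is preceded by the 1 + 2 + ··· + s = s(s + 1)/2 pairs on earlier diagonals and by a pairs on its own diagonal. The function names are illustrative, and math.isqrt is used only to locate the diagonal when inverting.

```python
from math import isqrt

def pair_to_index(a, b):
    """Position of (a, b) in the sequence (0,0), (0,1), (1,0), (0,2), (1,1), ..."""
    s = a + b
    return s * (s + 1) // 2 + a

def index_to_pair(n):
    """Inverse mapping: recover the pair sitting at position n."""
    s = (isqrt(8 * n + 1) - 1) // 2      # largest s with s(s + 1)/2 <= n
    a = n - s * (s + 1) // 2
    return (a, s - a)

print([index_to_pair(n) for n in range(6)])
print(all(pair_to_index(*index_to_pair(n)) == n for n in range(1000)))
```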


Corollary 1.4.5 |N| = |Z × Z|.


It is left as an exercise to prove that |N| = |Q|, where, as usual, Q is the set of
rational numbers, the fractions formed from the integers.

In the next chapter, we shall generalize Theorem 1.4.4 by showing that for any
infinite cardinal number a, one has ℵ₀ · a = a.
Chapter 2
Basic Combinatorial Principles of Algebra


Abstract Many basic concepts used throughout Algebra have a natural home in
Partially Ordered Sets (hereafter called “posets”). Aside from obvious poset residents
such as Zorn’s Lemma and the well-ordered sets, some concepts are more
wide-roaming. Among these are the ascending and descending chain conditions,
the general Jordan-Hölder Theorem (seen here as a theorem on interval measures
of certain lower semilattices), Galois connections, the modular laws in lattices, and
general independence notions that lead to the concepts of dimension and transcendence
degree.


2.1 Introduction 


The reader is certainly familiar with examples of sets X possessing a transitive
relation “≤” which is antisymmetric. Such a pair (X, ≤) is called a partially ordered
set, often abbreviated as poset. For any subset Y of the poset X there is an induced
partial ordering (Y, ≤) imposed on Y by the partially ordered set (X, ≤) which
surrounds it: one merely restricts the relation “≤” to the pairs of Y × Y. We then call
(Y, ≤) an induced poset of (X, ≤).

Certainly one example of a partially ordered set familiar to most readers is the poset
P(X) := (2^X, ⊆), called the power poset of all subsets of X under the containment
relation.

Throughout this algebra course one will encounter sets X which are closed under
various n-ary operations subject to certain axioms, that is, some species of “algebraic
object”. Each such object X naturally produces a partially ordered set whose
members are the subsets of X closed under these operations, that is, the poset of
algebraic subobjects of the same species. For example, if X is a group, then the poset
of algebraic subobjects is the poset of all subgroups of X. If R is a ring and M is a
right R-module, then we obtain the poset of submodules of M. Special cases are the
posets of vector subspaces of a right vector space V and the poset of right ideals of
a ring R.

In turn these posets have special induced posets. Thus the poset L_{<∞}(V) of all
finite-dimensional vector subspaces of a (possibly infinite-dimensional) vector space
V is transparently a subposet of the poset of all vector subspaces of V. For example,
from a group G one obtains the poset of normal subgroups of G, or more generally,
the poset of subgroups invariant under any fixed subgroup A of the automorphism
group of G. Similarly there are posets of invariant subrings of polynomial rings, and
invariant submodules for an R-module admitting operators. Finally, there are induced
posets of algebraic objects which are closed with respect to a closure operator on a
poset (perhaps defined by a Galois connection).

© Springer International Publishing Switzerland 2015
E. Shult and D. Surowski, Algebra, DOI 10.1007/978-3-319-19734-0_2

All of these examples will be made precise later. The important thing to note at 
this stage is that 


1. Partially ordered sets underlie all of the algebraic structures discussed in this book.

2. Many of the crucial conditions which make arguments work are basically properties
of the underlying posets alone and do not depend on the particular algebraic
species within which one is working. Here are the main examples:


(a) Zorn’s Lemma, 

(b) the ascending and descending chain conditions, 

(c) Galois connections and closure operators, 

(d) interval measures on semimodular semilattices (the general Jordan–Hölder
Theorem), and

(e) dependence theory (providing the notion of “dimension”).


The purpose of this chapter is to introduce those basic arguments that arise strictly 
from the framework of partially ordered sets, ready to be used for the rest of this 
book. 


2.2 Basic Definitions 


2.2.1 Definition of a Partially Ordered Set 


A partially ordered set, (P, ≤), hereafter called a poset, is a set P with a reflexive, transitive,
antisymmetric binary relation ≤. This means that for all elements x, y and z of P,


1. (reflexivity) x ≤ x,
2. (transitivity) x ≤ y and y ≤ z together imply x ≤ z, and
3. (antisymmetry, here also called antireflexivity)¹ the assertions x ≤ y and y ≤ x together imply
x = y.


It is often useful to view elements of a poset pictorially, as if they were vertices
placed in a vertical plane. Thus we say “x is below y” or “y is above x” if x ≤ y in
some poset.²


¹ In the literature on binary relations this property is normally called “antisymmetric”; in this
book “antireflexive” is used as a synonym for it.


² This is just metaphorical language, nothing more.
We write y ≥ x if x ≤ y, and write x < y if x ≤ y and x ≠ y. In the latter case
we say that x is properly below y, or, equivalently, that y is properly above x.
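For a finite set the defining conditions can be checked by brute force. The sketch below is illustrative only: the function names are ours, the relation is passed as a two-argument predicate `leq`, and reflexivity is tested as well because the standard definition of a partial order requires it. Divisibility on {1, ..., 12} serves as the example, and "same parity" as a relation that is transitive but not antisymmetric.

```python
from itertools import product

def is_reflexive(elements, leq):
    return all(leq(x, x) for x in elements)

def is_transitive(elements, leq):
    return all(leq(x, z)
               for x, y, z in product(elements, repeat=3)
               if leq(x, y) and leq(y, z))

def is_antisymmetric(elements, leq):
    return all(x == y
               for x, y in product(elements, repeat=2)
               if leq(x, y) and leq(y, x))

def is_poset(elements, leq):
    elements = list(elements)
    return (is_reflexive(elements, leq)
            and is_transitive(elements, leq)
            and is_antisymmetric(elements, leq))

def properly_below(x, y, leq):
    """x < y means x <= y and x != y."""
    return leq(x, y) and x != y

def divides(x, y):          # x <= y if and only if x divides y
    return y % x == 0

def same_parity(x, y):      # transitive and reflexive, but not antisymmetric
    return (x - y) % 2 == 0

print(is_poset(range(1, 13), divides))      # True
print(is_poset(range(1, 5), same_parity))   # False
```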


2.2.2 Subposets and Induced Subposets 


Now let (P, ≤) be a poset, and suppose X is a subset of P with a relation ≤_X for
which (X, ≤_X) is a partially ordered set. In general, there may be no relation between
(X, ≤_X) and the order relation ≤ that the elements of X inherit from (P, ≤). But if
it is true that for any x and y in X,


x ≤_X y implies x ≤ y, (2.1)


then we say that (X, ≤_X) is a subposet of (P, ≤). Thus in a general subposet it
might happen that two elements x_1 and x_2 of (X, ≤_X) are incomparable with respect
to the ordering ≤_X even though one is bounded by the other (say, x_1 ≤ x_2) in the
ambient poset (P, ≤).

However, if the converse implication in Eq. (2.1) holds for a subposet, we say that
the subposet is an induced subposet. Formally, (X, ≤_X) is defined to be an induced subposet
of (P, ≤) if and only if X ⊆ P and for any x and y in X,


x ≤_X y if and only if x ≤ y. (2.2)


Suppose (P, ≤) is a poset, and X is a subset of P. Then we can agree to induce the
relation ≤ on the subset X, that is, for any two elements x and y of the subset X, we
agree to say that x ≤_X y if and only if x ≤ y in the poset (P, ≤). Thus an induced
subposet of (P, ≤) is entirely determined once its set of elements is specified.
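The difference between conditions (2.1) and (2.2) shows up already in a three-element example. In the sketch below (names and the example are our own choices) the ambient order is divisibility, a relation on X is stored as a set of pairs, and `sparse` is a partial order on X = {1, 2, 4} that satisfies (2.1) but omits the comparisons 1 ≤ 4 and 2 ≤ 4.

```python
def divides(x, y):
    return y % x == 0

def induced_relation(X, leq):
    """All pairs (x, y) of X x X with x <= y in the ambient poset."""
    return {(x, y) for x in X for y in X if leq(x, y)}

def is_subposet(rel_X, leq):
    """Condition (2.1): x <=_X y implies x <= y."""
    return all(leq(x, y) for (x, y) in rel_X)

def is_induced_subposet(X, rel_X, leq):
    """Condition (2.2): x <=_X y if and only if x <= y."""
    return rel_X == induced_relation(X, leq)

X = {1, 2, 4}
sparse = {(1, 1), (2, 2), (4, 4), (1, 2)}    # 2 and 4 are incomparable here
induced = induced_relation(X, divides)

print(is_subposet(sparse, divides), is_induced_subposet(X, sparse, divides))    # True False
print(is_subposet(induced, divides), is_induced_subposet(X, induced, divides))  # True True
```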

The empty set is considered to be an induced subposet of any poset. 

Let X and Y be two subsets of P where (P, ≤) is a poset. We make these observations:


1. If (X, ≤_X) is a partial ordering on X and if (Y, ≤_Y) is a partial ordering on Y
such that (X, ≤_X) is an induced subposet of (Y, ≤_Y) and (Y, ≤_Y) is an induced
subposet of (P, ≤), then (X, ≤_X) is also an induced subposet of (P, ≤). This fact
means that if X is any subset of P, and (P, ≤) is a poset, we don’t need those
little subscripts attached to the relation “≤” any more: we can unambiguously
write (X, ≤) to indicate the subposet induced by (P, ≤) on X. In fact, when it is
clear that we are speaking of induced subposets, we may write X for (X, ≤) and
speak of “the induced poset X”.




2. If (X, ≤) and (Y, ≤) are induced subposets of (P, ≤) then we can form the
intersection of induced posets (X ∩ Y, ≤). In fact, since this notion depends only
on the underlying sets, one may form the intersection of induced posets


(⋂_{σ∈I} X_σ, ≤)


from any family {(X_σ, ≤) | σ ∈ I} of induced posets of a poset (P, ≤).


Perhaps the most important example of an induced poset is the interval. Suppose
(P, ≤) is a poset and that x ≤ y for elements x, y ∈ P. Consider the induced poset


[x, y]_P := {z ∈ P | x ≤ z ≤ y}.


This is called the interval in (P, ≤) from x to y. If (P, ≤) is understood, we would
write [x, y] in place of [x, y]_P. But we must be very careful: there are occasions in
which one wishes to discuss intervals within an induced poset (X, ≤), in which case
one would write [x, y]_X for the elements z of X between x and y. Clearly one could
mix the notations and write [x, y]_X := [x, y] ∩ X, the intersection of two induced
posets.
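As a small illustration of the two kinds of interval, take P = {1, ..., 24} ordered by divisibility and an induced poset X inside it. The helper name `interval` and the particular sets are our own choices for this sketch.

```python
def divides(x, y):
    return y % x == 0

def interval(x, y, P, leq):
    """The interval [x, y]_P = {z in P : x <= z <= y}."""
    return {z for z in P if leq(x, z) and leq(z, y)}

P = set(range(1, 25))
X = {2, 3, 6, 12, 24}

in_P = interval(2, 12, P, divides)   # {2, 4, 6, 12}
in_X = interval(2, 12, X, divides)   # {2, 6, 12}

print(sorted(in_P), sorted(in_X))
print(in_X == in_P & X)              # [x, y]_X is [x, y] intersected with X
```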


2.2.3 Dual Posets and Dual Concepts 


Suppose now that (P, ≤) is given. One may obtain a new poset, (P*, ≤*), whose
elements are exactly those of P, but in which the partial ordering has been reversed!
Thus a ≤ b in (P, ≤) if and only if b ≤* a in (P*, ≤*). In this case, (P*, ≤*) is
called the dual poset of (P, ≤).

We might as well get used to the idea that for every definition regarding (P, ≤),
there is a “dual notion”, that is, the generically-defined property or set in (P, ≤)
resulting from defining the same property or set in (P*, ≤*). Examples will follow.


2.2.4 Maximal and Minimal Elements of Induced Posets 


An element x is said to be maximal in (X, ≤) if and only if there is no element in X
which is strictly larger than x; that is, x ≤ y for y ∈ X implies x = y.

Note that this is quite different from the more specialized notion of a global
maximum (over X), which would be an element x in X for which x′ ≤ x for all
elements x′ of X. Of course defining something does not posit its existence; (X, ≤)
may not even possess maximal elements, or if it does, it may not contain a global
maximum.
We let max X denote the full (but possibly empty) collection of all maximal
elements of (X, ≤).

By replacing the symbol “≤” by “≥” in the preceding definitions, we obtain the
dual notions of a minimal element and a global minimum of an induced poset (X, ≤).

Of course some posets may contain no minimal elements whatsoever. 
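The gap between "maximal" and "global maximum" is easy to see in a finite example. The following sketch uses divisibility on {1, ..., 10}; the helper names are ours, and the dual notions are obtained simply by reversing the relation.

```python
def divides(x, y):
    return y % x == 0

def dual(leq):
    """The reversed relation: x <=* y if and only if y <= x."""
    return lambda x, y: leq(y, x)

def maximal_elements(X, leq):
    """max X: the elements x such that x <= y with y in X forces x == y."""
    return {x for x in X if all(x == y for y in X if leq(x, y))}

def global_maximum(X, leq):
    """The element above every member of X, or None if there is none."""
    tops = [x for x in X if all(leq(y, x) for y in X)]
    return tops[0] if tops else None

X = set(range(1, 11))
print(sorted(maximal_elements(X, divides)))         # [6, 7, 8, 9, 10]
print(global_maximum(X, divides))                   # None
print(sorted(maximal_elements(X, dual(divides))))   # [1]
print(global_maximum(X, dual(divides)))             # 1
```

Here there are five maximal elements but no global maximum, while 1 is both the unique minimal element and the global minimum.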


2.2.5 Global Maxima and Minima 


If a poset (P, ≤) possesses a global minimum, then, by the antisymmetric property,
that element is the unique global minimum, and so deserves a special name. We call
it the zero-element of the poset.

Dually, there may exist a global maximum (a “one-element”, denoted 1_P, or
something similar), which is an element in (P, ≤) with the property that all other
elements are less than or equal to it. Obviously this 1_P is the zero-element of the
dual poset, (P*, ≤*).

Some posets have a “zero”, some have a “one”, some have both, and some have 
neither. 

But whether or not a “zero” is present in P, one can always adjoin a new element
declared to be properly below all elements of P to obtain a new poset 0(P). For
example, if P contains just one element p, then 0(P) consists of just two elements
{0_1, p} with 0_1 < p, called a chain of length one. Iterating this construction with
the same meaning of P, we see that 0^2(P) introduces a “new zero element”, 0_2, to
produce a 3-element poset with 0_2 < 0_1 < p, that is, a chain of length two. Clearly
0^k(P) would be a poset with elements (up to a renaming of the elements) arranged
as

0_k < 0_{k−1} < ··· < 0_1 < p,


which we call a chain of length k. Of course this construction can also be performed
on any poset P so that 0^k(P) in general becomes the poset P with a tail of length
k − 1 adjoined to it from below, a sort of attached “kite-tail”.

Also, by dually defining 1^k(P) one can attach a “stalk” of length k − 1 above an
arbitrary poset P.


2.2.6 Total Orderings and Chains 


A poset (P, ≤) is said to be totally ordered if {c, d} ⊆ P implies c ≤ d or d ≤ c.
Sometimes this notion is referred to as “simply ordered”. Obviously any induced
poset of a totally ordered set is also totally ordered. Any maximal (or minimal)
element of a totally ordered set is in fact a global maximum (or global minimum).
Familiar examples of totally ordered sets are obtained as induced posets of the
real numbers under their usual ordering, for example (1) the rational numbers, (2)
the integers, (3) the positive integers, (4) the open, half-closed or closed intervals
(a, b), [a, b), (a, b], or [a, b] of the real number system, or (5) the intersections of
sets in (4) with those in (1)-(3).

A chain of a poset (P, ≤) is an induced subposet (C, ≤) which happens to be totally
ordered.
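For finite subsets, the chain condition is a direct pairwise check. The sketch below (with our own helper name `is_chain`) tests subsets of the positive integers under divisibility, and a set of real numbers under the usual ordering.

```python
import operator

def divides(x, y):
    return y % x == 0

def is_chain(C, leq):
    """True if any two elements of C are comparable under leq."""
    return all(leq(c, d) or leq(d, c) for c in C for d in C)

print(is_chain({1, 2, 4, 8}, divides))         # True: 1 | 2 | 4 | 8
print(is_chain({1, 2, 3}, divides))            # False: 2 and 3 are incomparable
print(is_chain({0.5, 2, 3.7}, operator.le))    # True: the reals are totally ordered
```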


2.2.7 Zornification 


Although an induced poset (X, ≤) of a poset (P, ≤) may not have a maximum, it
might have an upper bound, that is, an element m in P which is at least as large as anything
in the subset X (precisely rendered by “m ≥ x for all elements x of X”). (Of course
if it happened that such an element m were already in X then it would be a global
maximum of X.)

The dual notion of “lower bound” should be easy to formulate.

The existence of upper bounds on a class of induced posets of (P, ≤) is connected
with a criterion for asserting that maximal elements exist in P.


Zorn’s Lemma: Suppose (P, ≤) is a poset for which any chain has an upper bound.
Then any element of (P, ≤) lies below a maximal element.


However, Zorn’s Lemma is not a Lemma or even a Theorem. It does not follow 
from the axioms of set theory (Zermelo-Fraenkel), nor does it contradict them. That 
is why we called it “a criterion for an assertion”. Using it can never produce a 
contradiction with formal set theory. But since its denial also cannot produce such a 
contradiction, one can apparently have it either way. The experienced mathematician, 
though not always eschewing its use, at least prudently reports each appeal to “Zorn’s
Lemma”.³

Zorn’s Lemma is used in the next subsection. After that, it is used only very
sparingly in this book.


2.2.8 Well-Ordered Sets 


A well-ordered set is a special kind of totally ordered set. A poset (X, ≤) is said to
possess the well-ordered property (or is said to be well-ordered) if and only if


(a) (X, ≤) is totally ordered, and
(b) every non-empty subset of X possesses a (necessarily unique) minimal member.


³ Even an appropriate feeling of guilt is not discouraged. Who knows? Each indulgence in
Zornification might revisit some of you in another life.
It should be clear from this definition that every induced subposet of a well-ordered
set is also well-ordered.⁴

Any subset X of a well-ordered set A is called an initial segment if it possesses
the property that if x ∈ X and y ≤ x, then y ∈ X. (In the next subsection, we shall
meet sets with this property in the context of general posets. There they are called
order ideals.) An example of an initial segment of a well-ordered set would be the
set L(a) := {x ∈ A | x < a}. (Note that x < a means x ≤ a while x ≠ a.)


Lemma 2.2.1 Suppose X is an initial segment of the well-ordered set (A, ≤). Suppose
X ≠ A. Then X has the form L(a), for some element a ∈ A.


Proof By hypothesis, the set A − X, being non-empty, possesses a minimal element
a. All elements of L(a) belong to X by the minimality of a. Conversely, if some element
x of X satisfied a ≤ x, then a would belong to the initial segment X, which is false; since
A is totally ordered, every element of X is therefore properly less than a. Thus X = L(a).


Theorem 2.2.2 Any set A possesses a total ordering ≤ with respect to which it is
well-ordered.


Proof This proof is a classic application of Zorn’s Lemma. Let 𝒲 denote the full
collection of possible well-ordered posets (W, ≤_W), where W is a subset of A. (Note
that the same subset W may possess many possible well-orderings (W, ≤), each
representing a distinct element of 𝒲.) If (W_1, ≤_1) and (W_2, ≤_2) are two elements
of 𝒲, we write

(W_1, ≤_1) ⪯ (W_2, ≤_2)


if and only if (W_1, ≤_1) is an initial segment of the well-ordered poset (W_2, ≤_2).
(Specifically, this means that either W_1 = W_2 or there is an element x ∈ W_2 such that
W_1 = {z ∈ W_2 | z <_2 x}, and that the relation ≤_1 is just ≤_2 restricted to W_1 × W_1.) Since an initial
segment of an initial segment is an initial segment, the relation ⪯ is transitive and
reflexive. It is clearly antisymmetric. In this way the collection of well-ordered sets
𝒲 itself becomes a partially-ordered set with respect to the relation ⪯.

Now consider a chain C = {w_λ = (W_λ, ≤_λ) | λ ∈ I} in the poset (𝒲, ⪯).
Form the set-theoretic union W_C := ⋃_{λ∈I} W_λ. W_C inherits a natural total ordering
≤ derived from the ≤_λ. If x and y are elements of W_C, then x ∈ W_λ and y ∈ W_μ
for some indices λ, μ ∈ I. Since C is a chain, one of these W’s is contained in
the other, so we may assume (W_λ, ≤_λ) ⪯ (W_μ, ≤_μ). Since both x and y lie in the
totally ordered set W_μ, we write x ≤ y or y ≤ x according as x ≤_μ y or y ≤_μ x.
In other words, in comparing two elements of W_C, we utilize the comparison that
works in any of the posets (W_λ, ≤_λ) or (W_μ, ≤_μ) that may contain both of them.
The comparisons will always be consistent since each poset is an initial segment of
any poset above it in the chain.

Next, we must show that the poset (W_C, ≤) is well-ordered. For that purpose,
consider a non-empty subset S of W_C. Choose any x ∈ S. Then x ∈ W_λ for some
λ ∈ I. Now, (W_λ, ≤_λ) is a well-ordered set, and ≤_λ is the global relation ≤ of
W_C restricted to W_λ. Since S ∩ W_λ is non-empty, it contains a minimal element
m. We claim that m is a minimal element of the induced poset (S, ≤). If this were
false, one could find a second element m_0 ∈ S with m_0 ≤ m but m_0 ≠ m. But as
m_0 ∈ S ⊆ W_C, m_0 lies in some W_μ. Now if W_μ ⊆ W_λ, then m_0 ∈ S ∩ W_λ, against
the minimality of m. Thus, as C is a chain, we must have that (W_λ, ≤_λ) ⪯ (W_μ, ≤_μ).
But in that case, W_λ is an initial segment of W_μ, so that m_0 ≤ m implies that m_0 is
in W_λ. We have just seen that this is impossible, as it contradicts the minimality of
m in S ∩ W_λ. Thus no such m_0 exists, and m is the unique minimal element of S.

⁴ We shall see very soon that a well-ordered poset is simply a chain with the descending chain
condition (see p. 44 and Corollary 2.3.6).

At this point, (W_C, ≤) is a member of (𝒲, ⪯) that is an upper bound of all members
of the chain C. Since the chain C is arbitrary, we have achieved the conditions
necessary for applying Zorn’s Lemma. Thus there exists a maximal
element (W_m, ≤_m) in the poset (𝒲, ⪯).

If x were a point of A − W_m, one could extend the relation ≤_m to ({x} ∪ W_m) ×
({x} ∪ W_m) by declaring w ≤_m x for all w ∈ {x} ∪ W_m. Then W_m would become an
initial segment of ({x} ∪ W_m, ≤_m), and the latter is again a well-ordered set. Thus
one obtains (W_m, ≤_m) ⪯ ({x} ∪ W_m, ≤_m), against the maximality of (W_m, ≤_m)
in (𝒲, ⪯). So no such x exists, W_m = A, and we have obtained a well-ordering,
(A, ≤_m).


It is time to examine the actual structure of a well-ordered set. First, any well-ordered
set (A, ≤) inherits a simple partition into equivalence classes. Let us say that
two elements x and y of A are near if and only if there are only a finite number
of elements between x and y; that is, if x ≤ y, the set {z ∈ A | x ≤ z ≤ y} is finite,
and if y ≤ x then {z ∈ A | y ≤ z ≤ x} is finite.

Now, using only the fact that A is a totally-ordered set, we can conclude that
the nearness relation between points is transitive. For, given any three points
{a, b, c}, they possess some order, say a ≤ b ≤ c. Now if two of the intervals
[a, b], [b, c], [a, c] are finite, then so is the third interval. It follows that the relation
of nearness is transitive. It is obviously symmetric and reflexive, and so the “nearness”
relation is an equivalence relation on the elements of A. We let {A_λ} denote
the collection of nearness-equivalence classes of the well-ordered set A.

A well-ordered set A may or may not contain a maximal element. For example, any
well-ordered finite set contains a maximal element, while the infinite set of natural
numbers N, under its natural ordering, is a well-ordered set with no maximal member.
If A contains a maximal element m_A, let A_max denote the nearness equivalence
class containing m_A. In that case, there are only finitely many elements between the
minimal element of the set A_max and the maximal element m_A, forcing A_max to be
a finite set in this case. Otherwise, if there is no maximal element, let A_max be the
empty set. Whether or not a maximal element exists in A, let A* denote the set of
non-maximal members of A.

The well-ordered property produces an injective mapping


σ : A* → A

which takes each non-maximal element a to the least member σ(a) of the set {x ∈
A | x > a}. We call a the predecessor of σ(a). Note that a < σ(a) and that no elements
properly lie between them. Similarly, for any positive integer n for which σ^n(a) is defined, there
are exactly n − 1 elements lying properly between a and σ^n(a). Thus the elements
of {σ^n(a) | σ^{n−1}(a) ∈ A*} all belong to the same nearness class.


Lemma 2.2.3 Let A_λ be any nearness equivalence class of the well-ordered set A.


1. There is a least member a_λ of the set A_λ. It has no predecessor.

2. Conversely, if x is an element of A that has no predecessor, then x is the least
member of the nearness equivalence class that contains it.

3. If A_λ = A_max, that is, if it contains an element that is maximal in A, then it is a
finite set.

4. If A_λ ≠ A_max, then A_λ = {σ^n(a_λ) | n ∈ N}, where a_λ is the least member of the
set A_λ. In this case A_λ is an infinite countable set.


Proof Part 1. A least member a_λ of A_λ exists because A is well-ordered and A_λ is
non-empty. If a_λ = σ(x), then by the definition of σ, x is near a_λ, while being
properly less than it. In that case, a_λ could not be the least member of its nearness
class.

Part 2. If x has no predecessor and lies in A_λ, then x is near a_λ, forcing x = σ^n(a_λ)
for some natural number n. But since x has no predecessor, n = 0, and so x = a_λ.

Part 3. If m_A were a maximal element of (A, ≤), then m_A would be near the least
member a_max of its nearness equivalence class A_max, forcing m_A = σ^n(a_max) for a
natural number n. Since m_A is maximal in A_max, the class consists of a_max, σ(a_max), ..., σ^n(a_max),
and so |A_max| = n + 1.

Part 4. Suppose A_λ ≠ A_max. Then each element x of A_λ is non-maximal, and so
has a successor σ(x) that is distinct from it. Thus if a_λ is the least member of A_λ,
{σ^n(a_λ) | n ∈ N} is an infinite set. Clearly, A_λ ⊆ {σ^n(a_λ) | n ∈ N} ⊆ A_λ.


This analysis of well-ordered sets has implications for cardinal numbers in general. 


Corollary 2.2.4 Let A be any infinite set. Then A is bijective with a set of the form
N × B. In other words, any infinite cardinal number a has the form a = ℵ₀ · b, for
some cardinal number b.⁵


Proof By Theorem 2.2.2 one may impose a total ordering on the set A to produce a
well-ordered set (A, ≤). By Lemma 2.2.3 the set A_max of elements near a maximal
element is either finite or empty. Since A is infinite, the set A − A_max is non-empty
and has a partition


A − A_max = ⋃{A_λ | λ ∈ I},


where I indexes the nearness classes distinct from A_max. Each class A_λ contains a
unique minimal element a_λ which has no predecessor, and each element of A_λ can
be written as σ^n(a_λ) for a unique natural number n. Thus we have a bijection

A − A_max → N × P⁻,


where P⁻ = {a_λ | λ ∈ I} is the set of all elements of A which have no predecessor
and are not near a maximal element. The mapping takes an arbitrary element x of
A − A_max to the pair (n, a_λ) ∈ N × P⁻, where a_λ is the unique minimal element of
the nearness class containing x and x = σ^n(a_λ).

Now one can adjoin the finite set A_max to just one of the classes A_λ without
changing the cardinality of that class. This produces an adjusted bijection A →
N × P⁻, as desired.

⁵ The definition of cardinal number appears on p. 18, and ℵ₀ is defined to be the cardinality of the
natural numbers in the paragraphs that follow.


Corollary 2.2.5 Suppose A is an infinite set. Then A is bijective with N × A. For
cardinal numbers, if a is any infinite cardinal number, then a = ℵ₀ · a.


Proof It suffices to prove the statement in the language of cardinal numbers. By
Corollary 2.2.4, we may write a = ℵ₀ · b, for some cardinal number b. Now


ℵ₀ · a = ℵ₀ · (ℵ₀ · b) = (ℵ₀ · ℵ₀) · b = ℵ₀ · b = a,


by Theorem 1.4.4 and the associative law for multiplying cardinal numbers.


The above Corollary is necessary for showing that any two bases of an inde- 
pendence theory (or matroid) have the same cardinality when they are infinite (see 
Sect. 2.6). That result in turn is ultimately utilized for further dimensional concepts, 
such as dimensions of vector spaces and transcendence degrees of field extensions. 


2.2.9 Order Ideals and Filters 


For this subsection fix a poset (P, ≤). An order ideal of P is an induced subposet
(J, ≤), with this property:


If y ∈ J and x is an element of P with x ≤ y, then x ∈ J.


In other words, an order ideal is a subset J of P with the property that once some 
element belongs to J, then all elements of P below that element are also in J. Note 
that the empty subposet is an order ideal. 

The reader may check that the intersection of any family of order ideals is an order
ideal. (Since order ideals are a species of induced posets, we are using “intersection”
here in the sense of the previous Sect. 2.2.2 on induced subposets.) Similarly, any
set-theoretic union of order ideals is an order ideal.

Then there is the dual notion. Suppose (F, ≤*) were an order ideal of the dual
poset (P*, ≤*). Then what sort of induced poset of (P, ≤) is F? It is characterized
by being a subset of (P, ≤) with this property:

If x ∈ F and y is any element of P with x ≤ y, then y ∈ F.


Any subset of P with this property is called a filter. 
There is an easy way to construct order ideals. Take any subset X of P. Then 
define 
P_X := {y ∈ P | y ≤ x for some element x ∈ X}.


Note that always X ⊆ P_X. In fact the order ideal P_X is “generated” by X in the
sense that it is the intersection of all order ideals that contain X. Also, we understand
that P_∅ = ∅.

In the particular case that X = {x} we write P_x for P_{{x}}, and call P_x the principal
order ideal generated by x (or just a principal order ideal if x is left unspecified).

Of course any order ideal has the form P_X (all we have to do is set X = P_X) but in
general, we do not need all of the elements of X. For example if x_1 < x_2 for x_i ∈ X,
and if we set X′ := X − {x_1}, then P_{X′} = P_X. Thus if we throw out elements of X
each of which is below an element left behind, the new set defines the same order
ideal that X did. At first the student might get the idea that we could keep throwing
out elements until we are left with an antichain. That is indeed true when X is a finite
set, or more generally if every element of X is below some member of max(X, ≤).
But otherwise it is generally false.
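In the finite case the remark about discarding elements can be checked directly. The sketch below, under our own naming and with divisibility on P = {1, ..., 12} as the example, computes the order ideal generated by X and confirms that the maximal elements of X generate the same ideal.

```python
def divides(x, y):
    return y % x == 0

def order_ideal(X, P, leq):
    """P_X: the elements of P lying below at least one element of X."""
    return {y for y in P if any(leq(y, x) for x in X)}

def maximal_elements(X, leq):
    return {x for x in X if all(x == y for y in X if leq(x, y))}

P = set(range(1, 13))
X = {2, 4, 6}

ideal = order_ideal(X, P, divides)
reduced = maximal_elements(X, divides)           # 2 lies below 4 and is discarded

print(sorted(ideal))                             # [1, 2, 3, 4, 6]
print(sorted(reduced))                           # [4, 6]
print(order_ideal(reduced, P, divides) == ideal) # True
print(order_ideal(set(), P, divides))            # set(): the empty set generates the empty ideal
```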

There is another kind of order ideal defined by a subset X of a poset P. We set


⋀ P_X := ⋂_{x∈X} P_x = {y ∈ P | y ≤ x for every x ∈ X}.


Of course this order ideal may be empty. If it is non-empty we say that the set X has 
a lower bound in P—that is, there exists an element y € P which is below every 
element of the set X. 

The dual notion of the filter generated by X should be transparent. It is the set
P^X of elements of P which bound from above at least one element of X. It could be
described as

P^X := {y ∈ P | P_y ∩ X ≠ ∅}.


If X = {x}, then P^x is called a principal filter. By duality, the intersection and union
of any collection of filters is a filter.
Then there is also the filter


⋀ P^X := {y ∈ P | x ≤ y for all x ∈ X},


which may be thought of as the set of all “upper bounds” of the set X. Of course it
may very well be the empty set.
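The filter generated by X and the sets of upper and lower bounds can be computed side by side in a finite poset. The following sketch uses divisibility on P = {1, ..., 24} with X = {4, 6}; all function names are our own.

```python
def divides(x, y):
    return y % x == 0

def generated_filter(X, P, leq):
    """P^X: the elements of P lying above at least one element of X."""
    return {y for y in P if any(leq(x, y) for x in X)}

def upper_bounds(X, P, leq):
    """The elements of P lying above every element of X."""
    return {y for y in P if all(leq(x, y) for x in X)}

def lower_bounds(X, P, leq):
    """The elements of P lying below every element of X."""
    return {y for y in P if all(leq(y, x) for x in X)}

P = set(range(1, 25))
X = {4, 6}

print(sorted(generated_filter(X, P, divides)))  # [4, 6, 8, 12, 16, 18, 20, 24]
print(sorted(upper_bounds(X, P, divides)))      # [12, 24]
print(sorted(lower_bounds(X, P, divides)))      # [1, 2]
```

For divisibility the upper bounds of X are its common multiples in P and the lower bounds are its common divisors.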

An induced subposet (X, ≤) of (P, ≤) is said to be convex if, whenever x_1 and
x_2 are elements of X with x_1 ≤ x_2 in (P, ≤), then in fact the entire interval [x_1, x_2]_P
is contained in X. Any intersection of convex induced subposets is convex. All order
ideals and all filters are convex, and hence so is any intersection of them; unions of order
ideals (respectively, of filters) are again order ideals (respectively, filters) and so are convex as well.
2.2.10 Antichains


The reader is certainly familiar with many posets which are not totally ordered, such
as the set P(X) of all subsets of a set X of cardinality at least two. Here the relation
“≤” is containment of sets. Again there are many structures that can be viewed as a
collection of sets, and thus become a poset under this same containment relation: for 
example, subspaces of a vector space, subspaces of a point-line geometry, subgroups 
of a group, ideals in a ring and R-submodules of a given R-module and in general 
nearly any admissible subobject of an object admitting some specified set of algebraic 
properties. 

Two elements x and y are said to be incomparable if both of the statements x ≤ y
and y ≤ x are false. A set of pairwise incomparable elements in a poset is called an
antichain.⁶

The set max(X, ≤), where (X, ≤) is an induced subposet of (P, ≤), is always an
antichain.
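A quick finite check of the last statement, again with divisibility as the ordering and with helper names of our own choosing:

```python
def divides(x, y):
    return y % x == 0

def incomparable(x, y, leq):
    return not leq(x, y) and not leq(y, x)

def is_antichain(S, leq):
    """True if distinct elements of S are pairwise incomparable."""
    return all(x == y or incomparable(x, y, leq) for x in S for y in S)

def maximal_elements(X, leq):
    return {x for x in X if all(x == y for y in X if leq(x, y))}

X = set(range(1, 11))
top = maximal_elements(X, divides)

print(sorted(top))                      # [6, 7, 8, 9, 10]
print(is_antichain(top, divides))       # True
print(is_antichain({2, 3, 4}, divides)) # False, since 2 divides 4
```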


2.2.11 Products of Posets 


Suppose (P_1, ≤) and (P_2, ≤) are two posets.⁷ The product poset (P_1 × P_2, ≤) is
the poset whose elements are the elements of the Cartesian product P_1 × P_2, where
element (a_1, a_2) is declared to be less-than-or-equal to (b_1, b_2) if and only if


a_1 ≤ b_1 and also a_2 ≤ b_2.
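The componentwise ordering is straightforward to express in code. In this sketch the helper `product_leq` (our own name) builds the product relation from the two factor relations; taking both factors to be the integers under the usual order shows that the product of two chains need not be a chain.

```python
import operator

def product_leq(leq1, leq2):
    """Return the relation on pairs: (a1, a2) <= (b1, b2) iff a1 <= b1 and a2 <= b2."""
    def leq(a, b):
        return leq1(a[0], b[0]) and leq2(a[1], b[1])
    return leq

leq = product_leq(operator.le, operator.le)

print(leq((1, 2), (3, 2)))   # True
print(leq((1, 2), (2, 1)))   # False
print(leq((2, 1), (1, 2)))   # False: (1, 2) and (2, 1) are incomparable
```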


It should be clear that this notion can be extended to any collection of posets
{(P_σ, ≤) | σ ∈ I} to form a direct product of posets. Its elements are the elements of
the Cartesian product ∏_{σ∈I} P_σ, that is, the functions f : I → U, where U is the
disjoint union of the sets P_σ, with the property that at any σ in I, f always assumes a
value in P_σ. Then for functions f and g we have f ≤ g if and only if f(σ) ≤ g(σ)
for each σ ∈ I.

⁶ In a great deal of the literature, sets of pairwise incomparable elements are called independent.
Despite this convention, the term “independent” has such a wide usage in mathematics that little is
served by employing it to indicate the property of belonging to what we have called an antichain.
However, some coherent sense of the term “independence” is exposed in Sect. 2.6 later in this
chapter.

⁷ Usually authors feel that the two poset relations should always have distinguished notation, that
is, one should write (P_1, ≤_1) and (P_2, ≤_2) instead of what we wrote. At times this can produce
intimidating notation that would certainly finish off any sleepy students. Of course that precaution
certainly seems to be necessary if the two underlying sets P_1 and P_2 are identical. But sometimes
this is a little over-done. Since we already have posets denoted by pairs consisting of the set P_i and a
symbol “≤”, the relation “≤” is assumed to be the one operating on set P_i and we have no ambiguity
except possibly when the ground sets P_i are equal. Of course in the case the two “ground-sets”
are equal we do not hesitate for a moment to adorn the symbol “≤” with further distinguishing
emblems. This is exactly what we did in defining the dual poset. But even in the case that P_1 = P_2
one could say that in the notation, the relation “≤” is determined by the name P_i of the set, rather
than the actual set, so even then the “ordered pair” notation makes everything clear.

Suppose now each poset P_σ contains its own “zero element” 0_σ, less than or equal
to all other elements of P_σ. We can define a direct sum of posets {(P_σ, ≤)} as the
induced subposet of the direct product consisting only of the functions f ∈ ∏_{σ∈I} P_σ
for which f(σ) differs from 0_σ for only finitely many σ. This poset is denoted ∐_{σ∈I} P_σ.


2.2.12 Morphisms of Posets 


Let (P, ≤_P) and (Q, ≤_Q) be two posets. A mapping f : P → Q is said to be order
preserving (or is said to be a poset morphism) if and only if


x ≤_P y implies f(x) ≤_Q f(y).


Evidently, the composition g ∘ f of two poset morphisms f : (P, ≤_P) → (Q, ≤_Q)
and g : (Q, ≤_Q) → (R, ≤_R) is also a poset morphism (P, ≤_P) → (R, ≤_R). Clearly
the identity mapping 1_P : P → P is a morphism. If f : P → Q is a poset morphism
as above, then f ∘ 1_P = 1_Q ∘ f = f. The student should be aware that if x and
y are incomparable elements of P, it is still quite possible that f(x) ≤_Q f(y) or
f(y) ≤_Q f(x) in the poset (Q, ≤_Q).
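The warning at the end of the preceding paragraph is illustrated by the cardinality map from the subsets of a three-element set (ordered by inclusion) to the integers. The sketch below is our own example; `is_order_preserving` is a brute-force check that only makes sense for finite P.

```python
from itertools import combinations

def is_order_preserving(f, P, leq_P, leq_Q):
    """True if x <=_P y implies f(x) <=_Q f(y) for all x, y in P."""
    return all(leq_Q(f(x), f(y)) for x in P for y in P if leq_P(x, y))

def subset_leq(a, b):       # inclusion of frozensets
    return a <= b

def int_leq(m, n):          # usual ordering of the integers
    return m <= n

# P = all subsets of {0, 1, 2}; f = cardinality, with values in the integers.
P = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
f = len

x, y = frozenset({0}), frozenset({1, 2})
print(is_order_preserving(f, P, subset_leq, int_leq))   # True
print(subset_leq(x, y) or subset_leq(y, x))             # False: x and y are incomparable
print(int_leq(f(x), f(y)))                              # True: yet f(x) <= f(y)
```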

To clarify this point a bit further, using the morphism f : (P, ≤_P) → (Q, ≤_Q),
let us form the image poset f(P) := (f(P), ≤_f) whose elements are the images
f(p), p ∈ P, and we write f(x) ≤_f f(y) if and only if there is a pair of elements
(x′, y′) ∈ f⁻¹(f(x)) × f⁻¹(f(y)), the Cartesian product of the fibers above f(x)
and f(y), such that x′ ≤_P y′. Then the image poset (f(P), ≤_f) is a subposet of
(Q, ≤_Q).

We say that the morphism f is full if and only if the image poset (f(P), ≤_f) is an
induced poset of (Q, ≤_Q). Thus for a full morphism, we have a ≤_Q b in the image
if and only if there exist elements x and y in P such that x ≤_P y and f(x) = a and
f(y) = b.

A bijection f : P → Q is an isomorphism of posets if and only if it is also a full
poset morphism. Thus if f is an isomorphism, we have f(x) ≤_Q f(y) if and only
if x ≤_P y. In this case the inverse mapping f⁻¹ : Q → P is also an isomorphism.
Thus an isomorphism really amounts to changing the names of the elements and the 
name of the relation but otherwise does not change anything. Isomorphism of posets 
is clearly an equivalence relation and we call the corresponding equivalence classes 
isomorphism classes of posets. 

If the order preserving mapping f : P → Q is injective then the posets (P, ≤)
and (f(P), ≤_f) are isomorphic, and we say that f is an embedding of poset (P, ≤)
into (Q, ≤). The following result, although not used anywhere in this book, at least
displays the fact that one is interested in cases in which the image poset of an
embedding is not an induced subposet of the co-domain.


Any poset $(P, \le)$ can be embedded in a totally ordered set.⁸
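For a finite poset the embedding can be produced by repeatedly removing a minimal element, which is the elementary induction mentioned for the finite case. The sketch below is only an illustration under assumptions: the function names `divides` and `linear_extension` are hypothetical, and the poset is a small set of integers ordered by divisibility.

```python
def divides(a, b):
    """a <= b in the divisibility order."""
    return b % a == 0

def linear_extension(elements, leq):
    """List the elements so that leq(x, y) implies x appears no later than y."""
    remaining = list(elements)
    order = []
    while remaining:
        # A finite non-empty poset always has a minimal element.
        m = next(x for x in remaining
                 if not any(leq(y, x) and y != x for y in remaining))
        order.append(m)
        remaining.remove(m)
    return order

ext = linear_extension([6, 4, 3, 2, 12, 1], divides)
# Every divisibility relation is respected by the positions in the list.
respects_order = all(ext.index(a) <= ext.index(b)
                     for a in ext for b in ext if divides(a, b))
```

The resulting list is a chain containing the same elements, and the identity mapping into it is an injective order preserving map whose image is usually not an induced subposet.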


2.2.13 Examples 


Example 1 (Examples of totally ordered sets) Isomorphism classes of these sets are
called ordinal numbers. The most familiar examples are these:


1. The natural numbers. This is the set of non-negative integers,

$$\mathbb{N} := \{0, 1, 2, 3, 4, 5, 6, \ldots\},$$

with the usual ordering

$$0 < 1 < 2 < 3 < 4 < 5 < \cdots.$$

Rather obviously, as an ordered set, it is isomorphic to the chain

$$1 < 2 < 3 < \cdots$$

or even

$$k < k+1 < k+2 < k+3 < \cdots,$$

$k$ any integer, under the shift mapping $n \mapsto n + k - 1$.⁹

Recall that a poset $(X, \le)$ is said to possess the well-ordered property (or is said
to be well-ordered) if and only if (i) $(X, \le)$ is totally ordered, and (ii) every
non-empty subset of $X$ possesses a (necessarily unique) minimal member. It should
be clear from this definition that every induced subposet of a well-ordered poset
is also well-ordered. The point here is that the natural numbers form a well-ordered
poset under the usual ordering. This fundamental principle is responsible for some
of the basic properties concerning greatest common divisors (see Chap. 3, p. 2).

2. The system of integers

$$\mathbb{Z} := \{\cdots < -2 < -1 < 0 < 1 < 2 < \cdots\}.$$


⁸ Many books present an equivalent assertion “any poset has a linear extension”. The proof is an
elementary induction for finite posets. For infinite posets it requires some grappling with Zorn’s
Lemma and ordinal numbers.
⁹ This isomorphism explains why it is commonplace to do an induction proof with respect to the
second of these examples beginning with 1 rather than the first, which begins with 0.

In enumerative combinatorics, for example, the “natural numbers” $\mathbb{N}$ are defined to be all
non-negative integers, not just the positive integers (see Enumerative Combinatorics, vol. 1, p. 1, by
R. Stanley [1]).
One notes that any subset of $\mathbb{Z}$ which possesses a lower bound forms a
well-ordered induced subposet of $\mathbb{Z}$.

3. The system $\mathbb{Q}$ of the rational numbers with respect to the usual ordering: that is,
for fractions written with positive denominators, $a/b \ge c/d$ if and only if $ad \ge bc$, an
inequality of integers.

4. The real number system R. 

5. Any induced subposet of a totally ordered set. We have already mentioned
intervals of the real line. (Remark: the word “interval” here is used for the moment as it
is used in Freshman College Algebra: open, closed, and half-open intervals such
as $(a, b]$ or $[a, \infty)$. In this context, the intervals of posets that we have defined
earlier become the closed intervals $[a, b]$ of the real line, with a consistency of
notation.)
Here is an example: Consider the induced poset of the rational numbers $(\mathbb{Q}, \le)$
consisting of those positive fractions less than or equal to 1/2 which (in lowest
terms) have a denominator not exceeding the positive integer $d$ in absolute value.
For $d = 7$ this is the chain

$$\frac{1}{7} < \frac{1}{6} < \frac{1}{5} < \frac{1}{4} < \frac{2}{7} < \frac{1}{3} < \frac{2}{5} < \frac{3}{7} < \frac{1}{2}.$$

This is called a Farey series. A curiosity is that if $\frac{a}{b}$ and $\frac{c}{d}$ are adjacent members
from left to right in such a series, then $bc - ad = 1$!
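The Farey chain and the identity $bc - ad = 1$ are easy to check by machine. The following is a minimal sketch using Python's standard `fractions` module; the helper name `farey_chain` is an assumption made for this illustration.

```python
from fractions import Fraction

def farey_chain(d, upper=Fraction(1, 2)):
    """Positive fractions <= upper with reduced denominator at most d, in increasing order."""
    fracs = {Fraction(p, q) for q in range(1, d + 1) for p in range(1, q + 1)}
    return sorted(f for f in fracs if f <= upper)

chain = farey_chain(7)

# For adjacent members a/b < c/d, check that bc - ad = 1.
unimodular = all(x.denominator * y.numerator - x.numerator * y.denominator == 1
                 for x, y in zip(chain, chain[1:]))
```

`Fraction` reduces to lowest terms automatically, so collecting the values in a set removes duplicates such as 2/4 and 1/2.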


Example 2 (Examples of the classical locally finite (or finite) posets which are not
chains) A poset $(P, \le)$ is said to be a finite poset if and only if it contains only finitely
many elements, that is, $|P| < \infty$. It is said to be locally finite if and only if every
one of its intervals $[x, y]$ is a finite poset.


1. The Boolean poset $B(X)$ of all finite subsets of a set $X$, with the containment
relation ($\subseteq$) between subsets as the partial ordering. (There is, of course, the
power poset $P(X)$, the collection of all subsets of $X$, as well as the cofinite
poset which is the collection $B^*(X)$ of all subsets of $X$ whose complement in
$X$ is finite, both collections being partially ordered by the inclusion relation.
Of course, these two posets $P(X)$ and $B^*(X)$ are not locally finite unless $X$ is
finite.)

2. The divisor poset $D$ of all positive integers $\mathbb{N}^+$ under the divisor relation: that
is, we say $a \mid b$ if and only if integer $a$ divides integer $b$ evenly, i.e. $b/a$ is an
integer.¹⁰


¹⁰ There are variations on this theme: In an integral domain a non-unit $a$ is said to be irreducible
if and only if $a = bc$ implies one of $b$ or $c$ is a unit. Let $D$ be an integral domain in which each
non-unit is a product of finitely many irreducible elements, and let $U$ be its group of units. Let
$D^*/U$ be the collection of all non-zero multiplicative cosets $Ux$. Then for any two such cosets, $Ux$
and $Uy$, either every element of $Ux$ divides every element of $Uy$ or else no element of $Ux$ divides
any element of $Uy$. In the former case write $Ux \le Uy$. Then $(D^*/U, \le)$ is a poset. If $D$ is a unique
factorization domain, then, as above, $(D^*/U, \le)$ is locally finite for it is again a product of chains
(one factor in the product for each association class $Up$ of irreducible elements).

One might ask what this poset looks like when $D$ is not a unique factorization domain. Must it
be locally finite? It’s something to think about.




3. Posets of vector subspaces. The partially ordered set $L_{<\infty}(V, q)$ of all
finite-dimensional vector subspaces of a vector space $V$ over a finite field of $q$ elements
is a locally finite poset. (There is a generalization of this: the poset $L_{<\infty}(M)$
of all finitely generated submodules of a right $R$-module $M$ and in particular
the poset $L_{<\infty}(V)$ of all finite-dimensional subspaces of a right vector space $V$
over some division ring. But of course these are not locally finite in general.)¹¹

4. The partition set $\Pi_n$. Suppose $X$ is a set of just $n$ elements. Recall that a partition
of $X$ is a collection $\pi := \{Y_1, \ldots, Y_k\}$ of non-empty subsets $Y_j$ whose join is
$X$ but which pairwise intersect at the empty set. The subsets $Y_j$ are called the
components of the partition $\pi$.
Suppose $\pi := \{Y_i \mid i \in I\}$ is a partition of $X$ and $\pi' = \{Z_k \mid k \in K\}$ is a second
partition. We say that partition $\pi'$ refines partition $\pi$ if and only if there exists a
partition $K = J_1 + \cdots + J_k$ of the index set $K$ (a disjoint union with one block $J_j$ for
each component $Y_j$ of $\pi$), such that

$$Y_j = \bigcup_{\ell \in J_j} Z_\ell.$$

[We can state this another way: A partition $\pi$ can be associated with a surjective
function $f_\pi : X \to I$ where the preimages of the points are the fibers partitioning
$X$, the same being true of $\pi'$ and a surjective function $f_{\pi'} : X \to K$. We say that
partition $\pi'$ refines a partition $\pi$ if and only if there exists a surjective mapping
$\phi : K \to I$ such that $f_\pi = \phi \circ f_{\pi'}$, that is, for each $x \in X$, $f_\pi(x) = \phi(f_{\pi'}(x))$.]
For example, with $X = \{1, \ldots, 10\}$, the partition $\{6\}, \{4, 9\}, \{2, 3, 8\}, \{1, 5, 10\}, \{7\}$ with
five components refines $\{1, 5, 6, 7, 10\}, \{2, 3, 4, 8, 9\}$ with just two components.

Then $\Pi_n$ is the partially ordered set of all partitions of the $n$-set $X$ under the
refinement relation.¹²
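Both formulations of refinement can be tested directly on small examples. The sketch below is illustrative only: the function names are hypothetical, and the two partitions of $\{1, \ldots, 10\}$ are chosen for the illustration. Blocks are Python sets, and `block_map` plays the role of the surjection $\phi : K \to I$ on block indices.

```python
def is_partition(blocks, ground):
    """Non-empty, pairwise disjoint blocks whose union is `ground`."""
    blocks = list(blocks)
    return (all(blocks)
            and sum(len(b) for b in blocks) == len(ground)
            and set().union(*blocks) == set(ground))

def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(z <= y for y in coarse) for z in fine)

def block_map(fine, coarse):
    """phi: index of each fine block -> index of the coarse block containing it."""
    return {k: next(i for i, y in enumerate(coarse) if z <= y)
            for k, z in enumerate(fine)}

X = set(range(1, 11))
fine = [{6}, {4, 9}, {2, 3, 8}, {1, 5, 10}, {7}]
coarse = [{1, 5, 6, 7, 10}, {2, 3, 4, 8, 9}]
phi = block_map(fine, coarse)
```

Here `phi` sends the fine blocks with indices 0, 3, 4 to coarse block 0 and those with indices 1, 2 to coarse block 1, so each coarse block is the union of the fine blocks in its fiber.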

5. The Poset of Finite Multisets. Suppose $X$ is any non-empty set. A multiset is
essentially a sort of inventory whose elements are drawn from $X$. For example:
if $X$ = {oranges, apples, bananas} then $m$ = {three oranges, two apples} is
an inventory whose elements are multiple instances of elements from $X$. Letting
$O$ = oranges, $A$ = apples, and $B$ = bananas, one may represent the multiset $m$
by the symbol $3 \cdot O + 2 \cdot A + 0 \cdot B$ or even the sequence $(3, 2, 0)$ (where the
order of the coordinates corresponds to a total ordering of $X$). But both of these
notations can become inadequate when $X$ is an infinite set. The best way is to
think of a multiset as a mapping. Precisely, a multiset is a mapping

$$f : X \to \mathbb{N},$$

from $X$ into the set $\mathbb{N}$ of non-negative integers.


¹¹ In Aigner’s book (see references), $L_{<\infty}(V, q)$ is denoted $\mathcal{L}(\infty, q)$ in the case that $V$ has countable
dimension over the finite field of $q$ elements. This makes sense when one’s plan is to relate the
structure to certain types of generating functions (the $q$-series). But of course, it is a well-defined
locally finite poset whatever the dimension of $V$.

¹² Its cardinality $|\Pi_n|$ is called the $n$th Bell number and will reappear in Chap. 4 in the context of
permutation characters.
A multiset $f$ is dominated by a multiset $g$ (written $f \le g$) if and only if

$$f(x) \le g(x) \quad \text{for all } x \in X$$

(where the “$\le$” in the presented assertion reflects the standard total ordering of
the integers).
The collection $\hom(X, \mathbb{N})$ of all multisets of a set $X$ forms a partially ordered set
under the dominance relation. Since the multisets are actually mappings from
$X$ to $\mathbb{N}$ the dominance relation is exactly that used in comparing mappings in
products. We are saying that the definition of the poset of multisets shows us
that

$$(\hom(X, \mathbb{N}), \le) = \prod_{x \in X} \mathbb{N}, \tag{2.3}$$

that is, a product in which each “factor” in the definition of product of
posets is the constant poset $(\mathbb{N}, \le)$ of non-negative integers.
The multiset $f$ is said to be a finite multiset of magnitude $|f|$ if and only if

$$f(x) > 0 \ \text{for only finitely many values of } x, \text{ and} \tag{2.4}$$
$$|f| = \sum_{x \in X,\ f(x) > 0} f(x), \tag{2.5}$$

where the sum in the second equation is understood to be the integer 0 when the
range of summation is empty (i.e. $f(x) = 0$ for all $x \in X$).

Thus in the example concerning apples, bananas, and oranges above, the multiset 
m is finite of magnitude 3 + 2 = 5. 
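Finite multisets, dominance, and magnitude can be modelled with Python's `collections.Counter`, which returns 0 for any element it does not list. This is only a sketch: the names `dominated` and `magnitude` and the second inventory `g` are assumptions introduced for illustration, and all counts are taken to be non-negative.

```python
from collections import Counter

def dominated(f, g):
    """f <= g in the dominance order: f(x) <= g(x) for every x."""
    # Only elements with f(x) > 0 need checking, since g(x) >= 0 everywhere.
    return all(f[x] <= g[x] for x in f)

def magnitude(f):
    """|f|: the sum of the multiplicities f(x)."""
    return sum(f.values())

m = Counter(oranges=3, apples=2)             # the inventory m of the text
g = Counter(oranges=3, apples=4, bananas=1)  # a hypothetical larger inventory
```

With these definitions `m` is dominated by `g` but not conversely, and `magnitude(m)` reproduces the value 3 + 2 = 5.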

In this way the collection $M_{<\infty}(X)$ of all finite multisets forms an induced poset
of $(\hom(X, \mathbb{N}), \le)$. Next one observes that a mapping $f : X \to \mathbb{N}$ is a finite
multiset if and only if $f(x) = 0$ for all but a finite number of instances of $x \in X$.
This means Eq. (2.3) has a companion with the product replaced by a sum:

$$M_{<\infty}(X) = \sum_{x \in X} \mathbb{N}. \tag{2.6}$$


The above list of examples shall continue in the subsequent sections. 


2.2.14 Closure Operators 


We need a few other definitions related to poset mappings. 
An order preserving mapping $f : P \to P$ is said to be monotone non-decreasing
if $p \le f(p)$ for all elements $p$ of $P$.
A closure operator is a monotone non-decreasing poset homomorphism $\tau : P \to P$
which is idempotent. In other words, $\tau$ possesses the following three properties:

(i) (Monotonicity) If $x \in P$, then $x \le \tau(x)$.
(ii) (Homomorphism) If $x \le y$, then $\tau(x) \le \tau(y)$.
(iii) (Idempotence) $\tau(x) = \tau(\tau(x))$ for all $x \in P$.

We call the images of $\tau$ the closed elements of $P$.
There are many contexts in which closure operators arise, and we list a few. 


1. The ordinary topological closure in the poset of subsets of a topological space.

2. The mapping which takes a subset of a group (ring or $R$-module) to the subgroup
(subring or submodule, resp.) generated by that set in the poset of all subsets of
a group (ring or $R$-module).

3. The mapping which takes a set of points to the subspace which they generate in
a point-line geometry $(P, \mathcal{L})$.¹³
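The three defining properties can be verified exhaustively for a small instance of the second context. The sketch below takes the cyclic group of integers modulo 6 (an illustrative choice, not from the text) and the operator sending a subset to the subgroup it generates; all names are hypothetical.

```python
from itertools import combinations

N = 6  # work in the additive group of integers mod 6 (an assumption for this sketch)

def generated(subset):
    """Subgroup generated by `subset`: close {0} together with subset under addition mod N."""
    closed = {0} | set(subset)
    while True:
        bigger = closed | {(a + b) % N for a in closed for b in closed}
        if bigger == closed:
            return frozenset(closed)
        closed = bigger

subsets = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]

non_decreasing = all(s <= generated(s) for s in subsets)                  # property (i)
order_preserving = all(generated(s) <= generated(t)
                       for s in subsets for t in subsets if s <= t)       # property (ii)
idempotent = all(generated(generated(s)) == generated(s) for s in subsets)  # property (iii)
```

In a finite group, closure under addition already yields inverses, so the loop does produce a subgroup; the closed elements are exactly the subgroups.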


2.2.15 Closure Operators and Galois Connections 


One interesting context in which closure operators arise is that of Galois connections. Let
$(P, \le)$ and $(Q, \le)$ be posets. A mapping $f : P \to Q$ is said to be order reversing
if and only if $x \le y$ implies $f(x) \ge f(y)$.


Example 3 This example displays a common context that produces order reversing
mappings between posets. Suppose $X = \bigcup_{\sigma \in I} A_\sigma$, a union of non-empty sets indexed
by $I$. Now there is a natural mapping among power posets:

$$\alpha : P(X) \to P(I),$$

which takes each subset $Y$ of $X$ to

$$\alpha(Y) := \{\sigma \in I \mid Y \subseteq A_\sigma\}.$$

For example, if $Y$ is contained in no $A_\sigma$, then $\alpha(Y) = \emptyset$. Now if $Y_1 \subseteq Y_2 \subseteq X$ we
see that $\alpha(Y_2) \subseteq \alpha(Y_1)$; that is, as the $Y_i$ get larger, there are generally fewer $A_\sigma$
that contain them. Thus the mapping $\alpha$ is order-reversing.


Let $(P, \le)$ and $(Q, \le)$ be posets. A Galois connection $(P, Q, \alpha, \beta)$ is a pair
of order-reversing mappings $\alpha : P \to Q$ and $\beta : Q \to P$, such that the two
compositions $\beta \circ \alpha : P \to P$ and $\alpha \circ \beta : Q \to Q$ are both monotone
non-decreasing.


¹³ Here, the set of lines, $\mathcal{L}$, is simply a family of subsets of the set of points, $P$. A subspace is a set
$S$ of points, with the property that if a line $L \in \mathcal{L}$ contains at least two points of $S$, then $L \subseteq S$.
Thus the empty set and the set $P$ are subspaces. From the definition, the intersection over any
family of subspaces is a subspace. The subspace generated by a set of points $X$ is defined to be the
intersection of all subspaces which contain $X$.

Galois connections arise in several contexts, especially when some algebraic
object acts on something.


Example 4 (Groups acting on sets) Suppose that $G$ is a group of bijections of a set $X$
into itself. The group operation is composition of bijections and the identity element
is the identity mapping $1_X : X \to X$ defined by $x \mapsto x$ for all $x \in X$. For $g \in G$,
the corresponding bijection or permutation is described as an exponential operator
by $x \mapsto x^g$ for all $x \in X$.

Consider the two posets $P(X)$ and $P(G)$, the power posets of $X$ and $G$. Define

$$C_G : P(X) \to P(G)$$

by setting $C_G(U) := \{g \in G \mid g(u) = u^g = u \text{ for all } u \in U\}$, for each subset $U$ of
$X$. This mapping is order reversing: if $U \subseteq V$, then $C_G(U) \supseteq C_G(V)$. Conversely,
if $H$ is a subset of $G$, set $\mathrm{Fix}(H) := \{x \in X \mid x = x^h \text{ for all } h \in H\}$.

Then $\mathrm{Fix}$ is an order-reversing mapping $P(G) \to P(X)$. Therefore, $(P(X),
P(G), C_G, \mathrm{Fix})$ becomes a Galois connection upon verifying that the compositions
$C_G \circ \mathrm{Fix} : P(G) \to P(G)$ and $\mathrm{Fix} \circ C_G : P(X) \to P(X)$ are monotone
non-decreasing.
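The verification left to the reader can be carried out by brute force for a small permutation group. In this sketch the four-element group on $\{0, 1, 2, 3\}$ and the names `C`, `Fix`, and `powerset` are assumptions made for illustration; a permutation is stored as a tuple `g` with `g[x]` the image of `x`.

```python
from itertools import combinations

X = range(4)
# A hypothetical group of permutations of X: the identity, (0 1), (2 3), and (0 1)(2 3).
G = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2)]

def C(U):
    """The elements of G fixing every point of U."""
    return frozenset(g for g in G if all(g[u] == u for u in U))

def Fix(H):
    """The points of X fixed by every element of H."""
    return frozenset(x for x in X if all(h[x] == x for h in H))

def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# Both compositions are monotone non-decreasing.
fix_c_non_decreasing = all(U <= Fix(C(U)) for U in powerset(X))
c_fix_non_decreasing = all(H <= C(Fix(H)) for H in powerset(G))
# Both mappings are order reversing.
c_reverses = all(C(V) <= C(U) for U in powerset(X) for V in powerset(X) if U <= V)
fix_reverses = all(Fix(K) <= Fix(H) for H in powerset(G) for K in powerset(G) if H <= K)
```

For instance, the permutations fixing the point 0 are the identity and (2 3), and these fix exactly the points 0 and 1, so `Fix(C({0}))` is `{0, 1}`.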


Example 5 In the above example, the set $X$ might have extra structure that is
preserved by $G$. For example $X$ might itself be a group, ring, field, or a vector space, and
$G$ is a group of automorphisms of $X$. This situation arises in the classical Galois
theory studied in Chap. 11, where $X$ is a field, and where $G$ is a group of automorphisms
fixing a subfield $Y$ of $X$.


Example 6 A ring $R$ may act as a ring of endomorphisms of an abelian group $A$
with the (multiplicative) identity element inducing the identity mapping on $A$. One
can then form a Galois connection $(P(A), P(R), C_R, \mathrm{Fix})$ with

$$C_R(U) := \{r \in R \mid u^r = u \text{ for all } u \in U\},$$

$$\mathrm{Fix}(S) := \{a \in A \mid a^s = a \text{ for all } s \in S\}$$

for all $U \in P(A)$ and for all $S \in P(R)$.


Example 7 Another example arises in algebraic geometry. We say that a polynomial
$p(x_1, \ldots, x_n)$ vanishes at a vector $v = (a_1, \ldots, a_n)$ in the vector space $F^{(n)}$ of
$n$-tuples over the field $F$ if and only if $p(v) := p(a_1, \ldots, a_n) = 0 \in F$. Let
$(P, \le)$ be the poset of ideals in the polynomial ring $F[x_1, \ldots, x_n]$ with respect to
the containment relation and for each ideal $J$ let $\alpha(J)$ be the set of vectors in $F^{(n)}$
at which each polynomial of $J$ vanishes. Let $(Q, \le)$ be the poset of all subsets of
$F^{(n)}$ with respect to containment. For any subset $X$ of $F^{(n)}$, let $\beta(X)$ be the set of all
polynomials which vanish simultaneously at every vector in $X$. Then $(P, Q, \alpha, \beta)$
is a Galois connection.




The Corollary following the Lemma below shows how closure operators can arise 
from Galois connections. 


Lemma 2.2.6 Suppose $(P, Q, \alpha, \beta)$ is a Galois connection. Then for any elements
$p$ and $q$ of $P$ and $Q$ respectively

$$\alpha(\beta(\alpha(p))) = \alpha(p) \quad \text{and} \quad \beta(\alpha(\beta(q))) = \beta(q).$$


Proof For $p \in P$, $p \le \beta(\alpha(p))$ since $\beta \circ \alpha$ is monotone non-decreasing. Thus
$\alpha(\beta(\alpha(p))) \le \alpha(p)$ since $\alpha$ is order reversing.
On the other hand

$$\alpha(p) \le \alpha(\beta(\alpha(p))) = (\alpha \circ \beta)(\alpha(p))$$

since $\alpha \circ \beta$ is monotone non-decreasing. By antisymmetry, we have the first equation
of the statement of the Lemma.

The second statement then follows from the symmetry of the definition of Galois
connection, that is, $(P, Q, \alpha, \beta)$ is a Galois connection if and only if $(Q, P, \beta, \alpha)$
is.


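Corollary 2.2.7 If $(P, Q, \alpha, \beta)$ is a Galois connection, then $\tau := \beta \circ \alpha$ is a closure
operator on $(P, \le)$.


Proof By the definition of a Galois connection, $\tau = \beta \circ \alpha$ is monotone non-decreasing,
and it is order preserving because it is a composition of two order-reversing mappings.
Idempotence is immediate upon taking the $\beta$-image of both sides of the first equation of the
preceding Lemma.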


Example 8 Once again consider the order reversing mapping $\alpha : P(X) \to P(I)$ of
Example 3, where $X$ was a union of non-empty sets $A_\sigma$ indexed by a parameter $\sigma$
ranging over a set $I$. For every subset $Y$ of $X$, $\alpha(Y)$ was the set of those $\sigma$ for which
$Y \subseteq A_\sigma$.

There is another mapping $\beta : P(I) \to P(X)$ defined as follows: If $J \subseteq I$ set
$\beta(J) := \bigcap_{\sigma \in J} A_\sigma$, with the understanding that if $J = \emptyset$, then $\beta(J) = X$. Then $\beta$ is
easily seen to be an order-reversing mapping between the indicated power posets.

Now the mapping $\tau = \beta \circ \alpha : P(X) \to P(X)$ takes each subset $Y$ of $X$ to the
intersection of all of the $A_\sigma$ which contain it (with the convention that an intersection
over an empty family of sets denotes the set $X$ itself). Thus $\tau$ is a nice closure operator.

Similarly, $\rho = \alpha \circ \beta : P(I) \to P(I)$ takes each subset $J$ to the set

$$\rho(J) = \{\sigma \in I \mid A_\sigma \supseteq \textstyle\bigcap_{j \in J} A_j\}.$$
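The two closure operators of this example can be computed explicitly for a small family of sets. In the sketch below the family `A` with index set {"a", "b", "c"} is a hypothetical choice, and the function names mirror the symbols of the example.

```python
# A hypothetical family of non-empty sets A_sigma indexed by I = {"a", "b", "c"}.
A = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {3, 4, 5}}
X = set().union(*A.values())

def alpha(Y):
    """Indices sigma with Y contained in A_sigma."""
    return {s for s in A if set(Y) <= A[s]}

def beta(J):
    """Intersection of the A_sigma for sigma in J; the empty intersection is X."""
    result = set(X)
    for s in J:
        result &= A[s]
    return result

def tau(Y):
    return beta(alpha(Y))

def rho(J):
    return alpha(beta(J))
```

Here `tau({2})` is `{2, 3}`, the intersection of the two sets containing 2, while `tau({1, 5})` is all of `X` because no single member of the family contains both 1 and 5.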
2.3 Chain Conditions


2.3.1 Saturated Chains 


Recall that a chain C of poset (P, <) is simply a (necessarily induced) subposet 
(C, <) which is totally ordered. In case C is a finite set, |C| — 1 is called the length 
of the chain. 

We say that a chain $C_2$ refines a chain $C_1$ if and only if $C_1 \subseteq C_2$. The chains of
a poset $(P, \le)$ themselves form a partially ordered set under the inclusion relation
(the dual of the refinement relation) which we denote $(\mathrm{ch}(P, \le), \subseteq)$.

Suppose now that

$$C_0 \subseteq C_1 \subseteq \cdots$$

is an ascending chain in the poset $(\mathrm{ch}(P, \le), \subseteq)$. Then the set-theoretic union $\bigcup C_i$
is easily seen to be totally ordered and so is an upper bound in $(\mathrm{ch}(P, \le), \subseteq)$ of this
chain. An easy application of Zorn’s Lemma then shows that every element of the
poset $(\mathrm{ch}(P, \le), \subseteq)$ lies below a maximal member. These maximal chains are called
unrefinable chains. Thus


Theorem 2.3.1 If $C$ is a chain in any poset $(P, \le)$, then $C$ is contained in an
unrefinable chain of $(P, \le)$.


Of course we can restrict this in special ways to an interval. In a poset $(P, \le)$, a
chain from $x$ to $y$ is a chain $(C, \le)$ with $x$ as its unique minimal element and $y$ as its
unique maximal element. Thus $x \le y$ and $\{x, y\} \subseteq C \subseteq [x, y]$.

The collection of all chains of $(P, \le)$ from $x$ to $y$ itself becomes a poset
$(\mathrm{ch}[x, y], \subseteq)$ under the containment (or “corefinement”) relation. A chain from $x$ to
$y$ is said to be saturated if it is a maximal element of $(\mathrm{ch}[x, y], \subseteq)$.¹⁴

We can apply Theorem 2.3.1 for $P = [x, y]$ and the chains $C$ in it that do contain
$\{x, y\}$ to obtain:


Corollary 2.3.2 If $C$ is a chain from $x$ to $y$ in a poset $(P, \le)$, then there exists a
saturated chain $C'$ from $x$ to $y$ which refines $C$. In particular, given an interval $[x, y]$
of a poset $(P, \le)$, there exists an unrefinable chain in $P$ from $x$ to $y$.


2.3.2 Algebraic Intervals and the Height Function 


Let $(P, \le)$ be any poset. An interval $[x, y]$ of $(P, \le)$ is said to be algebraic (or
of finite height) if and only if there exists a saturated chain from $x$ to $y$ of finite
length.¹⁵ Note that to assert that $[a, b]$ is an algebraic interval does not preclude
the simultaneous existence of infinite chains from $a$ to $b$. The height of an algebraic
interval $[x, y]$ is then defined to be the minimal length of a saturated chain from $x$ to
$y$.¹⁶
If $[a, b]$ is an algebraic interval, its height is denoted $h(a, b)$, and is always a
non-negative integer. We denote the collection of all algebraic intervals of $(P, \le)$ by
the symbol $A_P$.

¹⁴ Note that $(\mathrm{ch}[x, y], \subseteq)$ is not quite the same as $(\mathrm{ch}([x, y], \le), \subseteq)$ since the latter may contain chains
which, although lying in the interval $[x, y]$, do not contain $x$ or $y$.


Proposition 2.3.3 The following hold:


(i) If $[a, b]$ and $[b, c]$ are algebraic intervals of poset $(P, \le)$, then $[a, c]$ is also an
algebraic interval.

(ii) The height function $h : A_P \to \mathbb{N}$ from the non-empty algebraic intervals of
$(P, \le)$ to the non-negative integers satisfies this property: If $[a, b]$ and $[b, c]$
are algebraic intervals, then

$$h(a, c) \le h(a, b) + h(b, c).$$
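In a finite poset, saturated chains from $x$ to $y$ are paths in the diagram of covers, so the height is a shortest-path length. The following sketch assumes a small hypothetical poset given by its cover relations ($0 < a < 1$ and $0 < b < c < 1$), in which saturated chains from 0 to 1 have different lengths; the function name `height` is likewise an assumption.

```python
from collections import deque

# Hypothetical finite poset, listed by covers: each key is covered by the listed elements.
covers = {"0": ["a", "b"], "a": ["1"], "b": ["c"], "c": ["1"], "1": []}

def height(x, y):
    """Minimal length of a saturated chain from x to y, or None if x <= y fails."""
    dist = {x: 0}
    queue = deque([x])
    while queue:            # breadth-first search upward through covers
        u = queue.popleft()
        if u == y:
            return dist[u]
        for v in covers[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None
```

In this poset `height("0", "1")` is 2, via the chain through `a`, even though the saturated chain through `b` and `c` has length 3; and the inequality of part (ii) is strict for the middle point `b`, since 2 < 1 + 2.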


2.3.3 The Ascending and Descending Chain Conditions in Arbitrary Posets


Let $(P, \le)$ be any partially ordered set. We say that $P$ satisfies the ascending chain
condition or ACC if and only if every ascending chain of elements of $P$ stabilizes
after a finite number of steps; that is, for any chain

$$p_1 \le p_2 \le p_3 \le \cdots$$

there exists a positive integer $N$ such that

$$p_N = p_{N+1} = \cdots = p_k \quad \text{for all integers } k \text{ greater than } N.$$

Put another way, $(P, \le)$ has the ACC if and only if every properly ascending chain
terminates after some finite number of steps. Finally, it could even be put a third way:
$(P, \le)$ satisfies the ascending chain condition if and only if there is no countable
sequence $\{p_n\}_{n=0}^{\infty}$ with $p_n < p_{n+1}$ for all natural numbers $n$.

¹⁵ This adjective “algebraic” does not enjoy uniform usage. In Universal Algebras, elements which
are the join of finitely many atoms are called algebraic elements (perhaps by analogy with the theory
of field extensions). Here we are applying the adjective to an interval, rather than an element of a
poset.

¹⁶ Since the adjective “algebraic” entails the existence of a finite unrefinable chain, the height of an
algebraic interval is always a natural number. The term “height” is used here instead of “length”
which is appropriate when all unrefinable chains have the same length, as in the semimodular
lower semilattices that appear in the Jordan-Hölder Theorem.


Lemma 2.3.4 For any poset $P = (P, \le)$ the following assertions are equivalent:


(i) $(P, \le)$ satisfies the ascending chain condition (ACC).
(ii) (The Maximum Condition) For every non-empty subset $X$ of $P$, the induced subposet
$(X, \le)$ contains a maximal member.

(iii) (The Second form of the Maximum Condition) In particular, for every induced
poset $X$ of $(P, \le)$, and every element $x \in X$, there exists an element $y \in X$
such that $x$ is bounded by $y$ and $y$ is maximal in $(X, \le)$. (To just make sure that
we understand this: for every $x \in X$ there exists an element $y \in X$ such that

(a) $x \le y$.
(b) If $u \in X$ and $y \le u$ then $u = y$.)


Remark The student is reminded: to say that “x is a maximal member of a subset X 
of a poset P” simply means x is an element of X which is not properly less than any 
other member of X. 


Proof of Lemma 2.3.4:

1. (The ACC implies the Maximum Condition.) Let $X$ be any non-empty subset of
$P$. By way of contradiction, assume that $X$ contains no maximal member. Choose
$x_1 \in X$. Since $x_1$ is not maximal in $X$, there exists an element $x_2 \in X$ with $x_1 < x_2$.
Suppose now that we have been able to extend the chain $x_1 < x_2$ to $x_1 < \cdots < x_n$.
Since $x_n$ is not maximal in $X$, there exists an element $x_{n+1} \in X$ such that $x_n < x_{n+1}$.
Thus, by mathematical induction, for every positive integer $n$, the chain $x_1 < \cdots < x_n$
can be extended to $x_1 < \cdots < x_n < x_{n+1}$. Taking the union of these extensions
one obtains an infinite properly ascending chain

$$x_1 < x_2 < x_3 < \cdots,$$

contrary to the assumption of ACC.¹⁷

2. (The first version of the Maximum Principle implies the second.) Now assume
only the first version of the maximum condition. Take a subset $X$ and an element
$x \in X$. Then set $X' = X \cap P^x$, where $P^x := \{z \in P \mid x \le z\}$ is the principal filter
of $P$ generated by $x$; thus $X'$ is the principal filter generated by $x$ in the induced
poset $(X, \le)$. Then $X'$ is non-empty (it contains $x$)
and so we obtain an element $y$ maximal in $X'$ from the first version of the maximum
condition. Now if $y$ were not a maximal member of $X$ there would exist an element
$y' \in X$ with $y < y'$. But in that case $x < y'$ so $y' \in X \cap P^x = X'$. But that would
contradict $y$ being maximal in $X'$. Thus $y$ is in fact maximal in $X$ and dominates
$x$ as desired.

¹⁷ The graduate student has probably encountered arguments like this many times, where a sequence
with certain properties is said to exist because after the first $n$ members of the sequence are
constructed, it is always possible to choose a suitable $(n+1)$-st member. This has an uncomfortable feel
to it, for the sequence alleged to exist must exemplify infinitely many of these choices, at least
invoking the Axiom of Choice in choosing the $x_i$. But in a sense it appears worse. The sets are not
just sitting there as if we had prescribed non-empty sets of socks in closets lined up in an infinite
hallway (the traditional folk-way model for the Axiom of Choice). Here, it is as if each new closet
was being defined by our choice of sock in a previous closet, so that it is really a statement about
the existence of infinite paths in trees having no vertex of degree one. All we can feebly tell you is
that it is basically equivalent to the Axiom of Choice.

3. (The second version of the Maximum Principle implies the ACC.) Assume the
second version of the Maximum Principle. Suppose the ACC failed. Then there must
exist an infinite properly ascending chain $x_0 < x_1 < \cdots$. Setting $X = \{x_i \mid i \in \mathbb{N}\}$,
and $x = x_0$, we see there is no maximal member of $X$ dominating $x$, contrary to the
statement of the second version of the Maximum Principle.

Of course, by replacing the poset $(P, \le)$ by its dual $P^* := (P, \ge)$ and applying
the above, we have the dual development:

We say a poset $(P, \le)$ possesses the descending chain condition or DCC if and
only if every descending chain

$$p_1 \ge p_2 \ge \cdots, \quad p_j \in P,$$

stabilizes after a finite number of steps. That is, there exists a positive integer $N$ such
that $p_N = p_{N+1} = \cdots = p_{N+k}$ for all positive integers $k$.

We say that $x$ is a minimal member of the subset $X$ of $P$ if and only if $x \ge y$ for
$y \in X$ implies $x = y$. A poset $(P, \le)$ is said to possess the minimum condition if
and only if


(Minimum Condition) Every non-empty subset of elements of the poset $P =
(P, \le)$ contains a minimal member.


Then we have: 


Lemma 2.3.5 A poset $P = (P, \le)$ has the descending chain condition (DCC)


(i) if and only if it satisfies the minimum condition or

(ii) if and only if it satisfies this version of the minimum condition: for any induced
poset $X$ and any $x \in X$, there exists an element $y$ minimal in $X$ which is bounded
above by $x$.


Proof: Just the dual statement of Lemma 2.3.4. 


Corollary 2.3.6 Every non-empty totally ordered poset with the descending chain 
condition (DCC), is a well-ordered set. 


Proof Let $(P, \le)$ be a non-empty totally ordered poset with the DCC. By Lemma
2.3.5, the minimum condition holds. The latter implies that any non-empty subset
$X$ contains a minimal member, say $m$. Since $(X, \le)$ is totally ordered, $m$ is a global
minimum of $(X, \le)$. Thus $(P, \le)$ is well-ordered.¹⁸


¹⁸ This conclusion reveals the incipient presence of the Axiom of Choice/Zorn’s Lemma in the
argument of the first paragraph of the proof of Lemma 2.3.4.
Corollary 2.3.7 (The chain conditions are hereditary) If $X$ is any induced poset of
$(P, \le)$, and $P$ satisfies the ACC (or DCC), then $X$ also satisfies the ACC (or DCC,
respectively).


Proof This is not really a Corollary at all. It follows from the definition of “induced 
poset” and the chain-definitions directly. Any chain in the induced poset is a fortiori a 
chain of its ambient parent. We mention it only to have a signpost for future reference. 


In the next section, Theorem 2.4.2 will have a surprising consequence for lower 
semilattices with the DCC. 
Any poset satisfying both the ACC and the DCC also satisfies 


(FC) Any unrefinable chain of P has finite length. 


This is because patently one of the two chain conditions is violated by an infinite
unrefinable chain. Conversely, if a poset $P$ satisfies condition (FC) then there can be
no properly ascending or descending chain of infinite length since by Theorem 2.3.1
it would then lie in an unrefinable chain which was also infinite, contrary to (FC). Thus


Lemma 2.3.8 The condition (FC) is equivalent to the assumption of both DCC and 
ACC. 


2.4 Posets with Meets and/or Joins 


2.4.1 Meets and Joins 


Let $W$ be any subset of a poset $(P, \le)$. The join of the elements of $W$ is an element
$v$ in $(P, \le)$ with these properties:


1. $w \le v$ for all $w \in W$.
2. If $v'$ is an element of $(P, \le)$ such that $w \le v'$ for all $w \in W$, then $v \le v'$.


Similarly there is the dual notion: the meet of the elements of $W$ in $P$ would be
an element $m$ in $P$ such that


1. $m \le w$ for all $w \in W$.
2. If $m'$ is an element of $(P, \le)$ such that $m' \le w$ for all $w \in W$, then $m' \le m$.


Of course, $P$ may or may not possess a meet or a join of the elements of $W$. But
one thing is certain: if the meet exists it is unique; if the join exists, it is unique.
Because of this uniqueness we can give these elements names. We write $\bigwedge_P(W)$ (or
just $\bigwedge(W)$ if the ambient poset $P$ is understood) for the meet of the elements of $W$
in $P$ (if it exists). Similarly, we write $\bigvee_P(W)$ (or $\bigvee(W)$) for the join in $P$ of all of
the elements of $W$ (if that exists).
In the case that $W$ is the set $\{a, b\}$, we render the meet and join of $a$ and $b$ by $a \wedge b$
and $a \vee b$, respectively. The reader may verify

$$a \wedge b = b \wedge a \tag{2.7}$$
$$a \wedge (b \wedge c) = (a \wedge b) \wedge c \tag{2.8}$$
$$(a \wedge b) \vee (a \wedge c) \le a \wedge (b \vee c) \tag{2.9}$$

and the three dual statements when the indicated meets and joins exist.
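A quick numerical check of (2.7) to (2.9) is possible in the divisor poset $D$, where the meet of two positive integers is their greatest common divisor and the join is their least common multiple. This is a sketch over a finite sample only (the integers 1 to 20, an arbitrary choice), with hypothetical helper names; it illustrates the identities and does not prove them.

```python
from math import gcd
from itertools import product

def lcm(a, b):
    """Join in the divisor poset."""
    return a * b // gcd(a, b)

def leq(a, b):
    """a <= b in the divisor poset means a divides b."""
    return b % a == 0

sample = range(1, 21)
eq_2_7 = all(gcd(a, b) == gcd(b, a) for a, b in product(sample, repeat=2))
eq_2_8 = all(gcd(a, gcd(b, c)) == gcd(gcd(a, b), c)
             for a, b, c in product(sample, repeat=3))
eq_2_9 = all(leq(lcm(gcd(a, b), gcd(a, c)), gcd(a, lcm(b, c)))
             for a, b, c in product(sample, repeat=3))
```

In this particular lattice the inequality (2.9) always holds with equality; in a general lattice it can be strict.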

We say that $(P, \le)$ is a lower semilattice if it is “meet-closed”, that is, the meet
of any two of its elements exists. In this case it follows that the meet of any finite
subset $\{a_1, a_2, \ldots, a_n\}$ exists (see Exercise (9) in Sect. 2.7.2). We denote this meet
by $a_1 \wedge a_2 \wedge \cdots \wedge a_n$. We then have


Lemma 2.4.1 Suppose $P$ is a lower semilattice, containing elements $x, a_1, a_2, \ldots,
a_n \in P$ such that $x \le a_i$ for $i = 1, 2, \ldots, n$. Then

$$x \le a_1 \wedge a_2 \wedge \cdots \wedge a_n.$$


Dually we can define an upper semilattice (it is “join closed”). 

A lattice is a poset $(P, \le)$ that is both a lower semilattice and an upper
semilattice, that is, the meet and join of any two of its elements both exist in $P$.
Thus, a lattice is a self-dual concept: If $P = (P, \le)$ is a lattice, then so is its dual
poset $P^* = (P, \le^*)$.

If every non-empty subset $U$ of $P$ has a meet (join) we say that arbitrary meets
(joins) exist. If both arbitrary meets and joins exist we say that $P$ is a complete
lattice.


Example 9 Here are some familiar lattices: 


1. Any totally ordered poset is a lattice. The meet of a finite set is its minimal 
member; its join is its maximal member. Considering the open real interval (0, 1) 
with its induced total ordering from the real number system, it is easy to see 


(a) that there are lattices with no “zero” or “one”, 
(b) that there can be (infinite) subsets of a lattice with no lower bound or no 
upper bound. 


2. The power set P(X) is the poset of all subsets of a set X under the containment 
relation. It is a lattice with the intersection of two sets being their meet, and the 
union of two sets being their join. This lattice is self dual and is a complete 
lattice, meaning that any subset of elements of P(X), whether infinite or not, has 
a least upper bound and greatest lower bound—i.e. a meet and a join. Of course 
the lattice has X as its “one” and the empty set as its “zero”. 
2.4.2 Lower Semilattices with the Descending Chain Condition


The proof of the following theorem is really due to Richard Stanley, who presented 
it for the case that P is finite [36, Proposition 3.3.1, p. 103]. 


Theorem 2.4.2 Suppose $P$ is a meet-closed poset (that is, a lower semilattice) with
the DCC.
Then the following assertions hold:


(i) If $X$ is a meet-closed induced poset of $P$, then $X$ contains a unique minimal
member.
(ii) Suppose, for some subset $X$ of $P$, the filter of upper bounds

$$P^X := \{y \in P \mid x \le y \text{ for all } x \in X\}$$

is non-empty. Then the join $\bigvee(X)$ exists in $P$.
In particular, if $(P, \le)$ possesses a one-element $\hat{1}$, then for any arbitrary subset
$X$ of $P$, there exists a join $\bigvee(X)$. Put more succinctly, if $\hat{1}$ exists, unrestricted
joins exist.

(iii) For any non-empty subset $X$ of $P$, the universal meet $\bigwedge(X)$ exists and is
expressible as a meet of a finite number of elements of $X$.

(iv) The poset $P$ contains a $\hat{0}$.

(v) If $P$ contains a $\hat{1}$, then $P$ is a complete lattice.


Proof (i) The set of minimal elements of $X$ is non-empty (Lemma 2.3.5 of the section
on chain conditions). Suppose there were at least two distinct minimal members of
$X$, say $x_1$ and $x_2$. Then $x_1 \wedge x_2$ is also a member of $X$ by the meet-closed hypothesis.
But by minimality of each $x_i$, one has

$$x_1 = x_1 \wedge x_2 = x_2.$$

Since any two minimal members of the set $X$ are now equal (and the set of them is
non-empty) there exists a unique minimal member. The proof of Part (i) is complete.

(ii) Let X be any subset of P for which the filter of upper bounds of X is non-empty. 
One observes that this filter is a meet-closed induced poset and so, by part (i), 
contains a unique minimal member j(X). Then, by definition, j(X) is the join ⋁(X). 
If 1̂ ∈ P then of course the filter of upper bounds is non-empty for all subsets X, 
and the result follows. 

(iii) Let W(X) be the collection of all elements of P which can be expressed as 
a meet of finitely many elements of X, viewed as an induced poset. Then W(X) is 
meet-closed, and so has a unique minimal member x_0 by (i). In particular x_0 ≤ x for 
all x ∈ X (since x_0 ∧ x lies in W(X) and so equals x_0 by minimality). Now suppose 
z ≤ x for all x ∈ X. Then by Lemma 2.4.1, z is less than or equal to any finite meet 
of elements of X, and so is less than or equal to x_0. Thus x_0 = ⋀(X), by definition 
of a global meet. 

(iv) By (i), P contains minimal non-zero elements (often called “atoms”). Suppose 
m is such an atom. If m ≤ x for all x ∈ P then m would be the element 0̂, against our 
definition of “minimal element”. Thus there exists an element y ∈ P such that y is 
not greater than or equal to m. Then y ∧ m (which exists by meet-closure) is strictly 
less than m. Since m is an atom, we have y ∧ m = 0̂ ∈ P. 

(v) If 1̂ exists, P enjoys unrestricted joins by (ii). But by (iii), unrestricted 
meets exist, and so now P is a complete lattice. 

The proof is complete. 


Example 10 The following posets are all lower semilattices with the DCC: 


1. ℕ⁺ = {1 < 2 < ⋯} of all positive integers with the usual ordering. This is a 
chain. 
2. D, the positive integers under the partial ordering of “dividing”, that is, a ≤ b 
if and only if integer a divides the integer b. 
3. B(X), the poset of all finite subsets of a set X. 
4. L_{<∞}(V), the poset of finite-dimensional subspaces of a vector space V. 
5. M_{<∞}(J), the finite multisets over a set J. 


It follows that arbitrary meets exist and a 0̂ exists. 


Remark Of course, we could also adjoin a global maximum 1̂ to each of these 
examples and obtain complete lattices in each case. 
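
As an informal illustration of Theorem 2.4.2 in the poset D of Example 10, the following sketch treats the meet as the greatest common divisor and the join as the least common multiple. The function names are ours and purely illustrative, and only finite lists of integers are handled. The helper `finite_meet_witness` mirrors part (iii): each element it keeps strictly lowers the running meet, so the running values form a properly descending chain.

```python
from functools import reduce
from math import gcd


def lcm2(a, b):
    """Least common multiple of two positive integers (the binary join in D)."""
    return a * b // gcd(a, b)


def universal_meet(xs):
    """Meet of a non-empty finite list in D: the greatest common divisor."""
    return reduce(gcd, xs)


def finite_meet_witness(xs):
    """Elements of xs whose meet already equals the meet of all of xs.

    Every kept element strictly lowers the running gcd, so the running
    values form a properly descending chain in D.
    """
    witness, current = [], 0  # gcd(0, x) == x, so 0 means "nothing seen yet"
    for x in xs:
        g = gcd(current, x)
        if g != current:
            witness.append(x)
            current = g
    return witness


def join_given_upper_bound(xs, bound):
    """Join of xs in D, assuming `bound` is a common multiple of xs."""
    j = reduce(lcm2, xs, 1)
    if bound % j != 0:
        raise ValueError("bound is not an upper bound of xs")
    return j


X = [60, 90, 150, 210]
print(universal_meet(X))                  # 30
print(finite_meet_witness(X))             # [60, 90]
print(join_given_upper_bound(X, 12600))   # 6300
```

Here the meet of four elements is already the meet of the first two, and a join exists because the set has an upper bound, as in part (ii).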


2.4.3 Lower Semilattices with both Chain Conditions 


Recall from Sect. 2.3, Lemma 2.3.8, that a poset has both the ACC and the DCC if 
and only if it satisfies the condition 


(FC) Every proper chain in P has finite length. 


An observation is that if P satisfies (FC), then so does its dual P*. Similarly, if 
P has finite height, then so does its dual. This is trivial. It depends only on the fact 
that rewriting a finite saturated chain in descending order produces a saturated chain 
of P*. 

We obtain at once 


Lemma 2.4.3 Suppose P is a lower semilattice satisfying (FC). 


(i) P contains a 0̂. If P contains a 1̂ then P is a complete lattice. 

(ii) Every element of P − {1̂} is bounded by a maximal member of P − {1̂} (that is, 
an element of max(P)). 

(iii) Every element of P − {0̂} is above a minimal element of P − {0̂} (that is, an 
element of min(P)). 

(iv) The meet of all maximal elements of P, that is ⋀(max(P)) (called the Frattini 
element or radical of (P, ≤)), exists and is the meet of just finitely many elements 
of max(P). 

(v) If 1̂ ∈ P, the join of all atoms, ⋁(min(P)) (called the socle of P and denoted 
soc(P)) exists and is the join of just finitely many atoms. 


Proof Parts (i) and (iv) follow from Theorem 2.4.2. Parts (ii) and (iii) are 
immediate consequences of the hypothesis (FC), and Part (v) follows from Parts (i) 
and (iii). 


Remark Hopefully the reader will notice that the hypothesis that P contained the 
element 1̂ played a role (even took a bow) in Parts (i) and (v) of the above Lemma. 
Is this necessary? After all, we have both the DCC and the ACC. Well, there is an 
asymmetry in the hypotheses. P is a lower semilattice, but not an upper semilattice 
(though this symmetry is completely restored once 1̂ exists in P, because then we 
have arbitrary joins). 


Example 11 Consider any one of the posets D, B(X), L_{<∞}(V). These are ranked 
posets, with the rank of an element being (1) the number of prime factors (counted 
with multiplicity), (2) the cardinality of a set, or (3) the vector-space dimension 
of a subspace, respectively. Select a positive integer r and consider the induced 
poset tr_r(P) of all elements of rank at most r in P, where P = D, B(X), L_{<∞}(V). 
Then tr_r(P) is still a lower semilattice with both the DCC and the ACC. But it has 
no 1̂. 


2.5 The Jordan-Hölder Theory 


In the previous two sections we defined the notions of “algebraic interval”, “height 
of an algebraic interval” and the meet-closed posets which we called “lower 
semilattices”. They shall be used with their previous meanings without further comment. 

This section concerns a basic theorem that emerges when a certain property, that of 
semimodularity, is imposed on lower semilattices. Many important algebraic objects 
give rise to lower semilattices which are semimodular (for example the posets of 
subnormal subgroups of a finite group, or the submodule poset of an R-module) and 
each enjoys its own “Jordan-Hölder Theorem”, it always being understood there is 
a general form of this theorem. It is in fact a very simple theorem about extending 
“semimodular functions”¹⁹ on the set of covers of a semimodular lower semilattice 
to an interval measure on that semilattice. It sounds like a mouthful, but it is really 
quite simple. 

Fix a poset (P, ≤). An unrefinable chain of length one is called a cover and is 
denoted by (a, b) (which is almost the name of its interval [a, b], altered to indicate 
that we are talking about a cover). Thus (a, b) is a cover if and only if a < b and there 
is no element c in P with a < c < b, i.e. a is a maximal element in the induced 
poset of all elements properly below b (the principal order ideal P_b minus {b}). The 
collection of all covers of (P, ≤) is denoted Cov_P. Note that Cov_P is a subset of A_P, 
the collection of all algebraic intervals of (P, ≤). 


¹⁹ The prefix “semi-” is justified for several reasons. The term “modular function” has quite another 
meaning as a certain type of meromorphic function of a complex variable. Secondly the function in 
question is defined in the context of a semimodular lower semilattice. So why not put in the “semi”? 
We do not guarantee that every term coined in this book has been used before. 


2.5.1 Lower Semi-lattices and Semimodularity 


A lower semilattice (P, ≤) is said to be semimodular if, whenever (x_1, b) and (x_2, b) 
are both covers with x_1 ≠ x_2, then both (x_1 ∧ x_2, x_1) and (x_1 ∧ x_2, x_2) are also covers. 
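
The definition can be tested by brute force on a small finite poset. The sketch below is only an illustration under our own assumptions: the poset is a finite list of positive integers ordered by divisibility, it is assumed to be meet-closed with meet equal to the gcd, and the helper names are hypothetical.

```python
from itertools import combinations
from math import gcd


def divisors(n):
    """All positive divisors of n, a meet-closed induced poset of D."""
    return [d for d in range(1, n + 1) if n % d == 0]


def is_cover(P, a, b):
    """True when (a, b) is a cover in P: a < b with nothing of P strictly between."""
    if a == b or b % a != 0:
        return False
    return not any(c not in (a, b) and c % a == 0 and b % c == 0 for c in P)


def is_semimodular(P):
    """Check the semimodularity condition for every pair of distinct covers (x1, b), (x2, b)."""
    for b in P:
        lower_covers = [x for x in P if is_cover(P, x, b)]
        for x1, x2 in combinations(lower_covers, 2):
            m = gcd(x1, x2)  # the meet x1 ∧ x2, assumed to lie in P
            if m not in P or not (is_cover(P, m, x1) and is_cover(P, m, x2)):
                return False
    return True


print(is_semimodular(divisors(360)))      # True
print(is_semimodular([1, 2, 3, 4, 12]))   # False
```

In the second poset both (3, 12) and (4, 12) are covers, but (1, 4) is not a cover because 2 lies strictly between 1 and 4.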


Lemma 2.5.1 Suppose (P, ≤) is a semimodular lower semilattice. Suppose [x, a] 
is algebraic and (b, a) is a cover, where x ≤ b. Then [x, b] is algebraic. 


Proof Since [x, a] is algebraic, there exists a finite unrefinable chain from x to a, 
say A = (x = a_0, a_1, ..., a_n = a). Clearly each interval [x, a_j] is algebraic. See 
Fig. 2.1. 

By hypothesis, x ≤ b and so there is a largest subscript i such that a_i ≤ b. Clearly 
i < n. If i = n − 1 then a_{n−1} = b so [x, b] is algebraic by the previous paragraph. 
Thus we may assume that b ≠ a_{n−1}. (See Fig. 2.1.) Since both (a_{n−1}, a) and (b, a) 
are covers, then by semimodularity, both (a_{n−1} ∧ b, a_{n−1}) and (a_{n−1} ∧ b, b) are 
covers. Continuing in this way we obtain that (a_k ∧ b, a_k) and (a_k ∧ b, a_{k+1} ∧ b) 
are covers for all k larger than i. Finally, as a_i ≤ b, this previous statement yields 
a_{i+1} ∧ b = a_i and (by semimodularity) (a_i, a_{i+2} ∧ b) must be a cover. Note that 


(x = a_0, ..., a_i, a_{i+2} ∧ b, a_{i+3} ∧ b, ..., a_{n−1} ∧ b, b) 


is a finite unrefinable chain since its successive pairs are covers. This makes [x, b] 
algebraic, completing the proof. 


Fig. 2.1 The poset showing [x, b] is algebraic. The symbol “cov” on a depicted 
interval indicates that it is a cover 
2.5.2 Interval Measures on Posets 


Let M be a commutative monoid. This means M possesses a binary operation, say 
“∗”, which is commutative and associative, and that M contains an identity element, 
say “e”, such that e ∗ m = m ∗ e = m for all m ∈ M. 

There is one commutative monoid that plays an important role in the applications 
of our main theorem and that is the monoid M(X) of all finite multisets over a set X. 
We have met this object before in the guise of a locally finite poset M_{<∞}(X) (see 
the last item under Example 2 of this chapter). We have seen that any finite multiset 
over X can be represented as a function f : X → ℕ from X to the non-negative 
integers ℕ whose “support” is finite, i.e. the function achieves a non-zero value 
in only finitely many instances as x wanders over X. Now, if f and g are two such 
functions, we may let “f + g” denote the function that takes x ∈ X to the non-negative 
integer f(x) + g(x), the sum of two integers. Clearly f + g has finite support, and so 
(M_{<∞}(X), +) (under this definition of “plus”) becomes a commutative semigroup. 
But as the constant function 0_X : X → {0} is an identity element with respect to this 
operation, M(X) := (M_{<∞}(X), +) is actually a commutative monoid. We call this 
the commutative monoid of finite multisets over X. 

An interval measure μ of a poset (P, ≤) is a mapping μ : A_P → M from the set 
of algebraic intervals of P into a commutative monoid (M, ∗) with identity element 
e such that 


μ(a, a) = e for all a ∈ P. (2.10) 
μ(a, b) ∗ μ(b, c) = μ(a, c) whenever [a, b] and [b, c] are in A_P. (2.11) 


[Notice that we have found it convenient to write μ(a, b) for μ([a, b]).] 
Here are some examples of interval measures on posets: 


Example 12 Let M be the multiplicative monoid of positive integers. Let (P, ≤) be 
the set of positive integers and write x ≤ y if x divides y evenly. Then every interval 
of (P, ≤) is algebraic. Define μ by setting μ(a, b) := b/a for every interval [a, b]. 


Example 13 Let (P, ≤) be as in Example 12, but now let M be the additive monoid 
of all non-negative integers. Now if we set μ(a, b) := the total number of prime 
divisors of b/a (counted with multiplicity), then μ is a measure. 


Example 14 Let (P, ≤) be the poset of all finite-dimensional subspaces of some 
(possibly infinite dimensional) vector space V, where “≤” is the relation of 
“contained in”. 


(i) Let M be the additive monoid of all non-negative integers. If we define 
μ(A, B) := dim(B/A), then μ is a measure on (P, ≤). 

(ii) If M is the multiplicative group {1, −1}, then setting μ(A, B) := (−1)^{dim(B/A)} 
for every algebraic interval [A, B] also defines a measure. 
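
The two defining conditions (2.10) and (2.11) are easy to test mechanically on a finite piece of the divisibility poset. The following sketch checks them for the measures of Examples 12 and 13 restricted to the divisors of 360. The checker and its names are our own illustrative choices, and the third call shows a mapping that fails condition (2.10).

```python
from itertools import product
from operator import add, mul


def omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while n > 1:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count


def is_interval_measure(P, mu, op, e):
    """Check (2.10) and (2.11) for mu on the divisibility poset P."""
    if any(mu(a, a) != e for a in P):
        return False
    return all(op(mu(a, b), mu(b, c)) == mu(a, c)
               for a, b, c in product(P, repeat=3)
               if b % a == 0 and c % b == 0)


P = [d for d in range(1, 361) if 360 % d == 0]


def ratio(a, b):
    """The measure of Example 12."""
    return b // a


def count(a, b):
    """The measure of Example 13."""
    return omega(b // a)


print(is_interval_measure(P, ratio, mul, 1))   # True
print(is_interval_measure(P, count, add, 0))   # True
print(is_interval_measure(P, ratio, add, 0))   # False
```

The last call fails because b/a takes the value 1, not the additive identity 0, on the trivial intervals [a, a].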
2.5.3 The Jordan-Hölder Theorem 


Now we can prove 


Theorem 2.5.2 (The Jordan-Hölder Theorem) Let (P, ≤) be a semimodular lower 
semilattice. Suppose μ : Cov_P → (M, +) is a mapping from the set of covers of P 
to a commutative monoid (M, +), and suppose this mapping is “semimodular” in 
the sense that 


μ(b, y) = μ(a ∧ b, a) whenever (a, y) and (b, y) are distinct covers. (2.12) 


Then for any two finite unrefinable chains U = (u = u_0, ..., u_n = v) and V = 
(u = v_0, ..., v_m = v) from u to v, we have 

Σ_{i=0}^{n−1} μ(u_i, u_{i+1}) = Σ_{i=0}^{m−1} μ(v_i, v_{i+1}) (2.13) 


and n = m (the finite summation is taking place in the additive monoid (M, +)). In 
particular, μ extends to a well-defined interval measure μ̂ : A_P → M. 


Proof If n = 0 or 1, then U = V, and the conclusion holds. We therefore 
proceed by induction on the minimal length h(u, v) of an unrefinable chain from 
u to v for any algebraic interval [u, v], that is, the height of [u, v]. It suffices 
to assume U = (u = u_0, u_1, ..., u_n = v) is such a minimal chain (so that 
n = h(u, v)) and prove Eq. (2.13), and n = m for any other unrefinable chain 
V = (u = v_0, v_1, ..., v_m = v). 

If u_{n−1} = v_{m−1}, then h(u, u_{n−1}) = n − 1, so by induction, 
Σ_{i=0}^{n−2} μ(u_i, u_{i+1}) = Σ_{i=0}^{m−2} μ(v_i, v_{i+1}) and m − 1 = n − 1, and the 
conclusion follows. 

So assume u_{n−1} ≠ v_{m−1}. Set z = u_{n−1} ∧ v_{m−1}. Since (u_{n−1}, v) and (v_{m−1}, v) are 
both covers, by semimodularity, so also are (z, u_{n−1}) and (z, v_{m−1}) (see Fig. 2.2). 


Fig. 2.2 The main figure for the Jordan-Hölder Theorem 
Since u ≤ z, (z, u_{n−1}) is a cover, and [u, u_{n−1}] is algebraic, we see that Lemma 
2.5.1 implies that [u, z] is also algebraic. So a finite unrefinable chain 


Z = (u = z_0, z_1, ..., z_r = z) 


from u to z exists. Since h(u, u_{n−1}) = n − 1 < n, by induction 


Σ_{i=0}^{r−1} μ(z_i, z_{i+1}) + μ(z, u_{n−1}) = Σ_{i=0}^{n−2} μ(u_i, u_{i+1}) (2.14) 


and r + 1 = n − 1. But h(u, v_{m−1}) ≤ r + 1 = n − 1 so induction can also be applied 
to the algebraic interval [u, v_{m−1}] to yield 


Σ_{i=0}^{r−1} μ(z_i, z_{i+1}) + μ(z, v_{m−1}) = Σ_{i=0}^{m−2} μ(v_i, v_{i+1}) (2.15) 


and the fact that r + 1 = m − 1. Thus n = m. 

But by (2.12), μ(z, u_{n−1}) = μ(v_{m−1}, v) and μ(z, v_{m−1}) = μ(u_{n−1}, v). The 
result (2.13) now follows from (2.14), (2.15) and the commutativity of (M, +). The 
proof is complete. 
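
For a concrete feel for the theorem, consider the divisibility poset D, where (a, b) is a cover exactly when b/a is prime. Assigning to each cover the prime b/a, regarded as a one-element multiset, gives a mapping satisfying (2.12), because y/b = a/gcd(a, b) when a and b are distinct elements covered by y. The sketch below, with illustrative names of our own, enumerates every unrefinable chain from 2 to 360 and confirms that all of them have the same length and the same multiset of primes.

```python
from collections import Counter


def prime_factors(n):
    """Prime factors of n with multiplicity, in non-decreasing order."""
    factors, p = [], 2
    while n > 1:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    return factors


def unrefinable_chains(u, v):
    """Yield every unrefinable chain from u to v in D (assumes u divides v)."""
    if u == v:
        yield (u,)
        return
    for p in sorted(set(prime_factors(v // u))):
        for tail in unrefinable_chains(u * p, v):
            yield (u,) + tail


def chain_measure(chain):
    """Sum of the cover values along the chain, as a multiset of primes."""
    return Counter(b // a for a, b in zip(chain, chain[1:]))


chains = list(unrefinable_chains(2, 360))
measures = {tuple(sorted(chain_measure(c).items())) for c in chains}
print(len(chains))   # 30
print(measures)      # {((2, 2), (3, 2), (5, 1))}
```

All 30 chains pass through five covers and yield the multiset {2, 2, 3, 3, 5}, the prime factorization of 360/2 = 180.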


To see how this theorem works in a semimodular lattice in which not all intervals 
are algebraic, the reader is referred to Example 15 on p. 53 and the remark following. 

There are many applications of this theorem. In the case of finite groups 
(R-modules) we let (M, +) be the commutative monoid of multisets of all 
isomorphism classes of finite simple groups (or all isomorphism classes of irreducible 
R-modules, resp.). These are the classical citations of the Jordan-Hölder Theorem 
for Groups and R-modules.²⁰ 


²⁰ We beg the reader to notice that in the case of groups there is no need for Zassenhaus’ famous 
“butterfly lemma”, nor the need to prove that subnormal subgroups of a finite group form a lattice. 
A lower semilattice will do. One of the classic homomorphism theorems provides the semimodular 
function from Cov_P to finite simple groups. The result is then immediate from Theorem 2.5.2, 
Eq. (2.13), where the interval measure displays the multiset of “chief factors” common to all 
saturated chains in P. 


2.5.4 Modular Lattices 


Consider for the moment the following example: 


Example 15 The poset (P, ≤) contains as elements P = {a, b, c} ∪ ℤ, where ℤ is 
the set of integers, with ordering defined by these rules: 


1. a ≤ p for all p ∈ P, i.e. a is the poset “zero”, 
2. c ≥ p for all p ∈ P, so c is the poset “one”, 
3. b is not comparable to any integer in ℤ, and 
4. the integers ℤ are totally ordered in the natural way. 


Notice that the meet or join of b and any integer are a and c respectively. Thus (P, <) 
is a lattice with its “zero” and “one” connected by an unrefinable proper chain of 
length 2 and at the same time by an infinite unrefinable chain. The lattice is even 
lower semimodular since the only covers are (a,b), (b,c) and (n,n + 1), for all 
integers n. 


Remark For the purposes of the Jordan-Hölder theory, such examples did not bother 
us, for the J-H theory was phrased as an assertion about measures which take values 
on algebraic intervals (that is to say, non-algebraic intervals could be ignored) 
and the calculation of the measure used only finite unrefinable chains. In Example 
15, the only intervals of the semimodular lattice (P, ≤) which are not algebraic 
are those of the form [a, n] or [n, c], n ∈ ℤ. The Jordan-Hölder Theorem is valid here: 
in fact if f is any function from the set of covers into a commutative monoid M, 
then f is semimodular in the sense of (2.12) (vacuously, since no element of P covers 
two distinct elements), and so by Theorem 2.5.2, f extends to an interval measure 
μ : A_P → M. Note that μ(a, c) = f(a, b) + f(b, c). 


However, unlike Example 15, many of the most important posets in Algebra are 
actually lattices with a property called “the modular law”, which prevents elements 
x and y from being connected by both a finite unrefinable chain and an infinite 
chain. This modular law always occurs in posets of congruence subalgebras which 
are subject to one of the so-called “Fundamental Theorems of Homomorphisms”. 

Recall the definition of lattice (p. 46). A lattice L is called a modular lattice or L 
is said to be modular if and only if 


(M) for all elements a, b, c of L with a ≥ b, 
a ∧ (b ∨ c) = b ∨ (a ∧ c). 
The dual condition is 
(M*) for all elements a, b, c of L with a ≤ b, 
a ∨ (b ∧ c) = b ∧ (a ∨ c). 


But by transposing the roles of a and b it is easy to see that the two conditions are 
equivalent. Thus 


Lemma 2.5.3 A lattice satisfies (M) if and only if it satisfies (M*). Put another way: 
a lattice L is modular if and only if its dual lattice L* is. 
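
The modular law (M) can be checked by exhaustive search on a small finite lattice. The sketch below is a hypothetical brute-force checker, not an efficient algorithm. It confirms (M) for the divisors of 60 under divisibility and shows that it fails in the five-element “pentagon” lattice 0 < x < y < 1, 0 < z < 1, a standard example of a non-modular lattice that we supply ourselves.

```python
from itertools import product


def lattice_ops(elements, leq):
    """Brute-force meet and join for a finite lattice given by its order relation."""
    def meet(x, y):
        lower = [z for z in elements if leq(z, x) and leq(z, y)]
        return next(z for z in lower if all(leq(w, z) for w in lower))

    def join(x, y):
        upper = [z for z in elements if leq(x, z) and leq(y, z)]
        return next(z for z in upper if all(leq(z, w) for w in upper))

    return meet, join


def is_modular(elements, leq):
    """Test (M): a ∧ (b ∨ c) equals b ∨ (a ∧ c) whenever b ≤ a."""
    meet, join = lattice_ops(elements, leq)
    return all(meet(a, join(b, c)) == join(b, meet(a, c))
               for a, b, c in product(elements, repeat=3) if leq(b, a))


divisors_60 = [d for d in range(1, 61) if 60 % d == 0]


def divides(x, y):
    return y % x == 0


# The pentagon: z is incomparable to both x and y.
strict = {("0", "x"), ("0", "y"), ("0", "z"), ("0", "1"),
          ("x", "y"), ("x", "1"), ("y", "1"), ("z", "1")}
pentagon = ["0", "x", "y", "z", "1"]


def below(p, q):
    return p == q or (p, q) in strict


print(is_modular(divisors_60, divides))   # True
print(is_modular(pentagon, below))        # False
```

In the pentagon the choice a = y, b = x, c = z gives a ∧ (b ∨ c) = y but b ∨ (a ∧ c) = x.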


An immediate consequence of the modular law is the following 
Lemma 2.5.4 (Identification Principle) Let x, y and n be elements of the modular 
lattice L. Suppose 


x ∨ n = y ∨ n, and (2.16) 
x ∧ n = y ∧ n. (2.17) 


Then either x = y or else x and y are incomparable. Equivalently, either x ≤ y 
or y ≤ x implies x = y. 


Proof Assume the hypotheses and assume x and y are comparable. Because x and 
y play symmetric roles in the hypothesis, we may assume that x ≤ y. Now by the 
modular law (M): 

y ∧ (x ∨ n) = x ∨ (y ∧ n). (2.18) 


But by Eq. (2.16), the left side of (2.18) is y ∧ (y ∨ n) = y. On the other hand, by 
Eq. (2.17), the right side of (2.18) is x ∨ (x ∧ n) = x. Thus x = y. 


The most important property of a modular lattice is given in the following: 


Theorem 2.5.5 (The Correspondence Theorem for Modular Lattices) Suppose L 
is a modular lattice. Then for any two elements a and b of L, there is a poset 
isomorphism 

μ : [a, a ∨ b] → [a ∧ b, b], 


taking each element x of the domain to x ∧ b. 


Proof As μ is defined, μ(a) = a ∧ b, μ(a ∨ b) = (a ∨ b) ∧ b = b, and if a ≤ x ≤ 
y ≤ a ∨ b, then μ(x) = x ∧ b ≤ y ∧ b = μ(y). Thus μ is a poset homomorphism and 
takes values in [a ∧ b, b]. 

Suppose now that x, y ∈ [a, a ∨ b] and μ(x) = μ(y). Then x ∧ b = y ∧ b and 
so a ∨ (x ∧ b) = a ∨ (y ∧ b). Since a ≤ x and a ≤ y we may apply the law (M) to 
each side of the last equation. This yields x = x ∧ (a ∨ b) = y ∧ (a ∨ b) = y, since 
a ∨ b dominates both x and y. Thus μ is injective. 

Now suppose x is an arbitrary element of the interval [a ∧ b, b], that is, a ∧ b ≤ 
x ≤ b. Then by the first inequality, x = x ∨ (b ∧ a). Since x ≤ b, applying (M*) 
(with x and a in the role of a and c in (M*)), x = b ∧ (x ∨ a) = μ(x ∨ a). Thus μ 
is onto. 

Finally, we must show that for any x, y ∈ [a, a ∨ b], one has μ(x) ≤ μ(y) if and 
only if x ≤ y. 

If x ≤ y, then clearly x ∧ b ≤ y ∧ b, so μ(x) ≤ μ(y). 

Conversely, suppose μ(x) ≤ μ(y). Then x ∧ b ≤ y ∧ b, giving us 


a ∨ (x ∧ b) ≤ a ∨ (y ∧ b). (2.19) 

Since a ≤ x and a ≤ y, applying (M*) to each side of (2.19) yields 
x ∧ (a ∨ b) ≤ y ∧ (a ∨ b) 


which gives us x ≤ y, since x and y are in [a, a ∨ b]. 
Thus μ is a poset isomorphism. 
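
The correspondence can be observed directly in the modular lattice D of positive integers under divisibility, where meet is the gcd and join is the lcm. The following sketch, with helper names of our own choosing, builds the map x ↦ x ∧ b on [a, a ∨ b] for a = 12, b = 90 and checks that it is a bijection onto [a ∧ b, b], preserves and reflects the order, and is inverted by y ↦ y ∨ a as in the surjectivity argument above.

```python
from math import gcd


def lcm2(a, b):
    """Join in D: the least common multiple."""
    return a * b // gcd(a, b)


def interval(lo, hi):
    """The interval [lo, hi] of D, listed in increasing numerical order."""
    return [d for d in range(lo, hi + 1) if d % lo == 0 and hi % d == 0]


def check_correspondence(a, b):
    """Return the map x -> gcd(x, b) on [a, lcm(a, b)] and whether it is an isomorphism."""
    domain = interval(a, lcm2(a, b))
    codomain = interval(gcd(a, b), b)
    mu = {x: gcd(x, b) for x in domain}
    bijective = sorted(mu.values()) == codomain
    order_iso = all((y % x == 0) == (mu[y] % mu[x] == 0)
                    for x in domain for y in domain)
    inverse_is_join = all(lcm2(mu[x], a) == x for x in domain)
    return mu, bijective and order_iso and inverse_is_join


mu, ok = check_correspondence(12, 90)
print(mu)   # {12: 6, 36: 18, 60: 30, 180: 90}
print(ok)   # True
```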


Recognizing the Chain Conditions in Modular Lattices 


Lemma 2.5.6 Assume a ≤ b ≤ c is a chain in a modular lattice L. Then the interval 
[a, c] has the ACC (DCC) if and only if both intervals [a, b] and [b, c] have the ACC 
(DCC). 


Proof Recall from Corollary 2.3.7 that if [a, c] has either of the two chain conditions, 
then its subintervals [a, b] and [b, c] also possess the same condition. 

So we assume that the two intervals [a, b] and [b, c] possess the ascending chain 
condition (ACC). We must show that [a, c] has the ascending chain condition. By 
way of contradiction, suppose 


a = c_0 < c_1 < c_2 < ⋯ 


is an infinite properly ascending chain in the poset [a, c]. Then we have ascending 
chains 
c_0 ∨ b ≤ c_1 ∨ b ≤ ⋯ and c_0 ∧ b ≤ c_1 ∧ b ≤ ⋯ 


in posets [b, c] and [a, b], respectively. Since these two posets are assumed to possess 
the ACC, there exists a natural number k such that for every integer m exceeding k 
we have 


c_k ∨ b = c_m ∨ b, (2.20) 
c_k ∧ b = c_m ∧ b. (2.21) 


Since c_k < c_m, the Identification Principle, Lemma 2.5.4, forces c_k = c_m. But that 
is impossible since these are distinct entries in a properly ascending chain. 

The argument that the presence of the descending chain condition for both [a, b] 
and [b, c] implies the same condition for [a, c] now follows by duality from our result 
for the ACC. (It can also be proved by considering an infinite properly descending 
chain c_0 > c_1 > ⋯, and again obtaining a natural number k such that Eqs. (2.20) 
and (2.21) hold and again invoking the Identification Principle.) 


Lemma 2.5.7 Assume L is a modular lattice and {a_1, ..., a_n} is a finite subset of 
L. 


(i) Suppose c ≤ a_i for all i = 1, 2, ..., n. Then [c, a_1 ∨ ⋯ ∨ a_n] has the ACC 
(DCC) if and only if each interval [c, a_i] possesses the ACC (DCC). 
(ii) Suppose a_i ≤ c for all i = 1, 2, ..., n. Then [a_1 ∧ ⋯ ∧ a_n, c] has the ACC 
(DCC) if and only if each interval [a_i, c] has the ACC (DCC). 


Proof Part (i). We assume c ≤ a_i for all i. Of course, by Corollary 2.3.7, if 
[c, a_1 ∨ a_2 ∨ ⋯ ∨ a_n] has the ACC (DCC) then so does any of its intervals [c, a_i]. 
So we need only prove the reverse implication. 

Assume each interval [c, a_i], i = 1, 2, ..., n, possesses the ACC (DCC). A simple 
induction on n reduces us to the case that n = 2. By Theorem 2.5.5, [a_2, a_1 ∨ a_2] ≅ 
[a_1 ∧ a_2, a_1] which has the ACC (DCC) since it is an interval of [c, a_1] which is 
hypothesized to have this chain condition. But the interval [a_1 ∧ a_2, a_2] is a 
subinterval of [c, a_2] which has the ACC (DCC). So we see that both intervals 
[a_1 ∧ a_2, a_2] and [a_2, a_1 ∨ a_2] have the ACC (DCC) and so by Lemma 2.5.6, the 
interval [a_1 ∧ a_2, a_1 ∨ a_2] also possesses the ACC (DCC). Finally, noting that 
[c, a_1 ∧ a_2] has the ACC (DCC) because it is an interval of [c, a_1] hypothesized to 
have this chain condition, one more application of Lemma 2.5.6 now yields the fact 
that [c, a_1 ∨ a_2] also enjoys this condition. 

Part (ii) follows from Part (i) by duality. 


Corollary 2.5.8 Any modular lattice is a semimodular lower semilattice and so 
is subject to the Jordan-Hölder theory (see Theorem 2.5.2). In particular, any two 
finite unrefinable chains that may happen to connect two elements x and y of the 
lattice must have the same length, a length which depends only on the pair (x, y). 


Proof Apply Theorem 2.5.5 in the case that (a, a ∨ b) and (b, a ∨ b) are both 
covers: the isomorphic intervals [a ∧ b, b] and [a ∧ b, a] are then covers as well. 


The preceding Corollary only compared two finite unrefinable chains from x to 
y. Could one still have both a finite unrefinable chain from x to y as well as an infinite 
one as in Example 15 at the beginning of this subsection? The next result shows that 
such examples are banned from the realm of modular lattices. 


Theorem 2.5.9 Suppose L is a modular lattice and suppose a = a_0 < a_1 < ⋯ < 
a_n = b is an unrefinable proper chain of length n proceeding from a to b. Then every 
other proper chain proceeding from a to b has length at most n. 


Proof Of course, by Zorn’s lemma, any proper chain is refined by a proper unrefinable 
chain. So, if we can prove that all proper chains connecting a and b are finite, such a 
chain possesses a well-defined measure by the Jordan-Hölder Theory. In particular 
all such chains possess the same fixed length n. So it suffices to show that any proper 
chain connecting a and b must be finite. 

We propose to accomplish this by induction on the parameter n given in the 
statement of the theorem. But this means that we shall have to keep track of the 
bound on length asserted by the induction hypothesis, in order to obtain new intervals 
[a’, b’] to which induction may be applied. 

So we begin by considering a (possibly infinite) totally ordered subposet X of 
(L, ≤) having a as its minimal member and b as its maximal member. We must 
show that X is a finite proper chain. If n = 1, then {a, b} = X is a cover, and we are 
done. 

If each x ∈ X − {b} satisfies x ≤ a_{n−1}, then X − {b} is finite by induction on n, and so 
X is finite. So we may suppose that for some x_β ∈ X − {b}, we have x_β ∨ a_{n−1} = b. By 
Theorem 2.5.5, (a_{n−1} ∧ x_β, x_β) is a cover. 

Now the distinct members of the set 


Y = {z ∧ a_{n−1} | z ∈ X} 


form a totally ordered set whose maximal member is a_{n−1} = b ∧ a_{n−1}, and whose 
minimal member is a = a ∧ a_{n−1}. By induction on n, Y is a finite chain of length at 
most n − 1. On the other hand, it is the concatenation of two proper chains 


Y⁺ := {y ∈ Y | y ≥ a_{n−1} ∧ x_β} 
Y⁻ := {y ∈ Y | y ≤ a_{n−1} ∧ x_β}, 
whose lengths are non-zero and sum to at most n − 1. 


Now the poset isomorphism μ : [x_β, b] → [a_{n−1} ∧ x_β, a_{n−1}] takes the members 
of X dominating x_β to Y⁺. Thus 


X ∩ [x_β, b] 
is a chain of the same length as Y⁺. It remains only to show that the remaining part 
of X, the chain X ∩ [a, x_β], is finite. 
Let index i be maximal with respect to the condition a_i ≤ x_β. Then for each index 
j exceeding i but bounded by n − 1, 


a_j = a_{j−1} ∨ (a_j ∧ x_β) and (a_{j−1} ∧ x_β, a_j ∧ x_β) 


is a cover or is length zero. It follows that a is connected to x_β by an unrefinable chain 
of length at most n − 1. By induction X ∩ [a, x_β] is finite. The proof is complete. 


Corollary 2.5.10 Let L be a modular lattice with minimum element 0 and maximum 
element 1. The following conditions are equivalent: 


1. There exists a finite unrefinable chain proceeding from 0 to 1. 
2. Every proper chain has length below a certain finite universal bound. 
3. L has both the ascending and descending chain conditions (see p. 48). 


Proof The Jordan-Hölder theory shows all chains connecting 0 to 1 are bounded by 
the length of any unrefinable one. How this affects all proper chains is left as an 
exercise. 


Any modular lattice which satisfies the conditions of Corollary 2.5.10 is said to 
possess a composition series. 
Example 16 Let L be the lattice D of all positive integers where a ≤ b if and only if 
a divides b. Then D is a modular lattice which possesses the DCC but not the ACC. 
There is a “zero” (the integer 1), but no lattice “one”. However, every interval [a, b] 
possesses a composition series. 


2.6 Dependence Theories 


2.6.1 Introduction 


If we say A “depends” on B, what does this mean? In ordinary language one might 
intend several things: “A depends entirely on B” or that “A depends just a little on 
B”—a statement so mild that it might suggest only that B has a “slight influence” on 
A. But some syntax seems to be applied all the same: thus if we say that A depends 
on B and that B in turn depends on C, then A (we suppose to some small degree) 
depends on C. 

In mathematics we also usually use the word “depends” in both senses. When one 
asserts that f(x, y) depends (to some degree) upon the variable x, one means only 
that f might be influenced by x. After all, it could be that f is a “constant” function 
as far as x is concerned. But on the other hand the mathematician also intends that f 
is entirely determined by the pair (x, y). Thus we may consider the phrase “f(x, y) 
depends on x” as one borrowed from ordinary everyday speech. In its various guises, 
the stronger idea of total and entire dependence appears throughout mathematics 
with such a common strain of syntactical features as to deserve codification. 

But as you will see, the theory here is highly special. 


2.6.2 Dependence 


Fix a set S, and let F(S) be the set of all finite subsets of the set S. A dependence 
relation on S is a relation D from S to F(S), that is, a subset of S × F(S), subject 
to the axioms (D1)-(D3) listed below. We shall say that the element s in S depends 
on the subset A ∈ F(S) if and only if (s, A) ∈ D. The dependence relation must 
satisfy these conditions: 


(D1) (The Reflexive Condition) If s ∈ S and if s ∈ A ∈ F(S), then s depends on 
A, that is, s depends on any finite subset of S that contains it. 

(D2) (The Transitivity Condition) If the element s depends on the finite set A, and 
if every member of A depends on the finite set B, then s depends on B. 

(D3) (The Exchange Condition) If s, a_1, ..., a_n are elements of S such that s depends 
on the set {a_1, ..., a_n} but does not depend on the set {a_1, ..., a_{n−1}}, then a_n 
depends on the set {a_1, ..., a_{n−1}, s}. 


Before concluding anything from this, let us consider some examples. 


Example 17 (Linear dependence) Suppose V is a left vector space over some field 
or division ring D. Let us say that a vector v depends on a finite set of vectors 
{v_1, ..., v_n} if and only if v can be expressed as a linear combination of those 
vectors, that is, Σ_{i=1}^{n} δ_i v_i = v for some choice of δ_i in D. One checks that this 
defines a dependence relation on V. 
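
For a very small vector space the three axioms can be verified exhaustively. The sketch below takes V to be the three-dimensional space over the two-element field, a choice made here only to keep the search finite. Vectors are encoded as the integers 0 to 7, with bitwise XOR playing the role of vector addition, and all names are illustrative.

```python
from itertools import combinations


def span(vectors):
    """GF(2)-span of bitmask vectors: close {0} under XOR with each vector."""
    result = {0}
    for v in vectors:
        result |= {w ^ v for w in result}
    return frozenset(result)


S = range(8)  # the eight vectors of GF(2)^3
subsets = [frozenset(c) for r in range(9) for c in combinations(S, r)]
spans = {A: span(A) for A in subsets}


def depends(s, A):
    """s depends on the finite set A when s is a linear combination of A."""
    return s in spans[frozenset(A)]


# (D1): every member of A depends on A.
d1 = all(depends(s, A) for A in subsets for s in A)

# (D2): if every member of A depends on B, then whatever depends on A depends on B.
d2 = all(spans[A] <= spans[B] for A in subsets for B in subsets if A <= spans[B])

# (D3): if s depends on A but not on A - {a}, then a depends on (A - {a}) | {s}.
d3 = all(depends(a, (A - {a}) | {s})
         for A in subsets for a in A for s in S
         if depends(s, A) and not depends(s, A - {a}))

print(d1, d2, d3)   # True True True
```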


Example 18 (Algebraic dependence) Let K be a field containing k as a subfield. (For 
example, one might let K and k respectively be the complex and rational number 
fields.) We say that an element b of K depends on a finite subset X = {a_1, ..., a_n} 
of K if and only if b is the root of a polynomial equation whose coefficients are 
expressible as k-polynomial expressions in X. This means that there exists a 
polynomial p(x, x_1, ..., x_n) in the polynomial ring k[x, x_1, ..., x_n] which, when 
evaluated at x = b, x_i = a_i, 1 ≤ i ≤ n, yields 0. We shall see in Chap. 11 that all 
three axioms of a dependence theory hold for this definition of dependence among 
elements of the field K. 


2.6.3 Extending the Definition to S x P(S) 


Our first task will be to extend the definition of a dependence relation D ⊆ S × F(S) 
to a relation D ⊆ S × P(S), where P(S) is the collection of all subsets of S (the 
power set). Thus, it will be possible to speak of element x depending on a (possibly 
infinite) subset A of S. Namely, for any subset A of S, we say that the element x 
depends on A if and only if there exists a finite subset A_1 of A such that x depends 
on A_1 (in the original sense). We leave it as an exercise to the reader to prove the 
following: 


Lemma 2.6.1 In the following, the sets A and B may be infinite sets: 


(i) Element x of S depends on any set A which contains it. 
(ii) If A and B are subsets of S, if x depends on A and every element of A depends 
on set B, then x depends on B. 
(iii) If a ∈ A, a subset of S, and x is an element of S which depends on A but does 
not depend on A − {a}, then a depends on (A − {a}) ∪ {x}. 


(See Exercise (1) in Sect. 2.7.4.) 
2.6.4 Independence and Spanning 


Let S be a set with dependence relation D ⊆ S × F(S), and let D ⊆ S × P(S) be 
the extension of this relation to infinite subsets as in Lemma 2.6.1 above. Now let A 
be any subset of S. The flat generated by A is the set ⟨A⟩ := {y | y depends on A}. 
A subset X of S is called a spanning set if and only if ⟨X⟩ = S. 

Next, we say that a subset Y of S is independent if and only if for each element 
x of Y, x does not depend on Y — {x}. 

Now the collections of all spanning sets form a partially ordered set under the 
containment relation. A similar statement holds for the collection of all independent 
sets. Thus it makes perfect sense to speak of a minimal spanning set and a maximal 
independent set (should such sets exist). We wish to show that these two concepts 
coincide. 


Theorem 2.6.2 Let S be a set with dependence relation D ⊆ S × P(S) as above. 


(i) Every minimal spanning set is a maximal independent set. 
(ii) Every maximal independent set is a minimal spanning set. 


Proof We prove (i). Suppose U is a minimal spanning set. If U is not independent 
then there exists an element u in U such that u depends on a finite subset U_1 of 
U − {u}. Now every element of S depends on U and, by Lemma 2.6.1, part (i), and 
what we know of u, every element of U depends on U − {u}. Thus by Lemma 2.6.1, 
part (ii), every element of S depends on U − {u}. Thus U − {u} is a spanning set, 
against the minimality of U. Thus U must be independent. 

If U were not a maximal independent set there would exist an element z in S − U 
which did not depend on U. But that is impossible as U spans S. 

Next we show (ii). If X is a maximal independent set, then by maximality, X spans 
S. If a proper subset X_1 of X also spanned S, then any element x of X − X_1 would 
depend on X_1 and so would depend on X − {x} by the definition of dependence. 
But this contradicts the fact that X is independent. Thus X is actually minimal as a 
spanning set. The proof is complete. 
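
As a small sanity check of Theorem 2.6.2, the following sketch works in one concrete dependence system that is assumed only for this illustration: S is the set of seven nonzero vectors of GF(2)^3, encoded as bitmasks, and x depends on A when x is an XOR-sum of elements of A. The helper names are hypothetical. The code enumerates all subsets by brute force and compares the minimal spanning sets with the maximal independent sets.

```python
from itertools import combinations

def span(vectors):
    """All GF(2)-linear combinations (XOR sums) of the given bitmask vectors."""
    reached = {0}
    for v in vectors:
        reached |= {r ^ v for r in reached}
    return reached

def depends(x, A):
    """Illustrative dependence relation: x lies in the GF(2)-span of A."""
    return x in span(A)

# Assumed example: the nonzero vectors of GF(2)^3, written as integers 1..7.
S = frozenset(range(1, 8))

def subsets(ground):
    items = sorted(ground)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def is_spanning(X):
    return all(depends(y, X) for y in S)

def is_independent(Y):
    # No element of Y depends on the remaining elements of Y.
    return all(not depends(x, Y - {x}) for x in Y)

all_subsets = list(subsets(S))
spanning = [X for X in all_subsets if is_spanning(X)]
independent = [Y for Y in all_subsets if is_independent(Y)]

# "<" on frozensets is proper containment.
minimal_spanning = {X for X in spanning if not any(Z < X for Z in spanning)}
maximal_independent = {Y for Y in independent if not any(Y < Z for Z in independent)}

print(minimal_spanning == maximal_independent)
```

In this example the two families coincide, as the theorem predicts; they are the bases of GF(2)^3 drawn from S.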


Next, we have the following important result. 
Theorem 2.6.3 Maximal independent subsets of S exist. 


Proof This is a straightforward application of Zorn’s Lemma. As remarked above, 
the collection 𝒥 of all independent subsets of S forms a partially ordered set (𝒥, ≤) 
under the inclusion relation. Now consider any chain 𝒞 = {J_α} of independent sets 
and form their union C := ⋃_α J_α. Then if C were not an independent set, there 
would exist an element x in C and a finite subset F of C − {x} such that x depends 
on F. But since the chain 𝒞 is totally ordered and F ∪ {x} is finite, there exists an 
index σ such that F ∪ {x} ⊆ J_σ. But this contradicts the independence of J_σ. Thus we see 
that C must be an independent set. 

Thus every chain 𝒞 in 𝒥 possesses an upper bound C in (𝒥, ≤). By the Zorn 
Principle, 𝒥 contains maximal elements. The proof is complete. 
2.6.5 Dimension 


The purpose of this section is to establish 
Theorem 2.6.4 If X and Y are two maximal independent sets, then |X| = |Y|. 


Proof By the Schröder-Bernstein Theorem (Theorem 1.4.1, p. 16), it suffices to show 
only that |X| ≥ |Y| for any two maximal independent sets X and Y. 

The proof has a natural division into two cases: the case X is a finite set and the 
case that X is infinite. 

First assume X is finite. We proceed by induction on the parameter k := |X| − 
|X ∩ Y|. If k = 0, then X = X ∩ Y ⊆ Y. But as Y is independent and each of its 
elements depends on its subset X, we must have X = Y, whence |X| = |Y| and we 
are done. Thus we may assume k > 0. 

Suppose X = {x_1, ..., x_n} where the x_i are pairwise distinct elements of S 
and the indexing is chosen so {x_1, ..., x_r} = X ∩ Y, thereby making k = n − r. 
From maximality of the independent set X, every element of Y − X depends on X. 
Similarly, every element of X depends on Y. If every element of Y depended on X_0 := 
{x_1, ..., x_{n−1}}, then x_n, which depends on Y, would depend on X_0 by Lemma 2.6.1 
(ii) above, against the independence of X. Thus there is some element y ∈ Y − X which 
depends on X but does not depend on X_0. By the exchange condition, x_n depends 
on X_1 := {x_1, ..., x_{n−1}, y}. Moreover, if X_1 were not independent, then either y 
depends on X_1 − {y} = {x_1, ..., x_{n−1}} or some x_i depends on (X_0 − {x_i}) ∪ {y}. The 
former alternative is ruled out by our choice of y. If the latter dependence held, then 
as the independence of X prevents x_i from depending on X_0 − {x_i}, the exchange 
condition would force y to depend on X_0, again contrary to the choice of y. Thus X_1 
is an independent set of n distinct elements. But as x_n depends on X_1, each element 
of X depends on X_1. Thus as all elements of S depend on X, they all depend on X_1 
as well. Thus X_1 is a maximal independent set. But since |X_1 ∩ Y| = 1 + |X ∩ Y|, 
induction on k yields |X_1| ≥ |Y|. The result now follows from |X| = |X_1|. 

Now assume X is an infinite set. Since Y is a maximal independent set, it is a 
spanning set. Thus for each element x in X there is at least one finite non-empty 
subset Y_x of Y on which it depends. Choose, if possible, y ∈ Y − ⋃_{x∈X} Y_x. Since y 
depends on X and by construction every element of X depends on ⋃_{x∈X} Y_x, we see 
that y depends on ⋃_{x∈X} Y_x ⊆ Y − {y}, contradicting the independence of Y. Thus we see 
that no such y can exist and so Y = ⋃_{x∈X} Y_x. 

Our goal is to produce an injective mapping 


φ : Y → X × N, 


where, as usual, N denotes the natural numbers. Since, for each x ∈ X, the set Y_x 
is finite, there exists an injective mapping φ_x : Y_x → N. The problem is that the 
sets Y_x may intersect non-trivially, so that merely combining the φ_x does not yield a 
well-defined φ. 

We get around this problem as follows: First, by Theorem 2.2.2, the set X can be 
well-ordered. Next, given y ∈ Y, let S(y) := {x ∈ X | y ∈ Y_x}, and let ℓ(y) be the 
least element of the set S(y). Now define φ : Y → X × N by setting 


φ(y) := (ℓ(y), φ_{ℓ(y)}(y)) ∈ X × N        (2.22) 


for each y ∈ Y. If φ(y_1) = φ(y_2), then ℓ(y_1) = ℓ(y_2) and then y_1 = y_2 follows 
from the injectivity of φ_{ℓ(y_1)}. Thus the mapping φ defined in (2.22) is injective. 
It follows that 
|Y| ≤ |X × N|. 


Now by Corollary 2.2.5 near the beginning of this chapter, |X| = |X × N|, since X 
is infinite. 
Thus |Y| ≤ |X| as required. The proof is complete. 


This common cardinality of maximal independent sets is called the dimension of 
the dependence system (S, D). 
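
To illustrate Theorem 2.6.4 in a finite case, the sketch below grows a maximal independent set greedily along every possible ordering of a small set S and compares the resulting cardinalities. The dependence system is an assumption made for this example: S consists of the six weight-two vectors of GF(2)^4 (as bitmasks), and dependence means membership in the GF(2)-span. The function names are hypothetical.

```python
from itertools import permutations

def in_span(x, vectors):
    """True if x is an XOR-sum of some of the given bitmask vectors."""
    reached = {0}
    for v in vectors:
        reached |= {r ^ v for r in reached}
    return x in reached

def greedy_maximal_independent(ordering):
    """Scan the elements in the given order, keeping each one that does not
    depend on the elements kept so far. The result is a maximal independent set."""
    chosen = []
    for v in ordering:
        if not in_span(v, chosen):
            chosen.append(v)
    return frozenset(chosen)

# Assumed example: the six vectors of GF(2)^4 with exactly two nonzero coordinates.
S = [0b0011, 0b0101, 0b0110, 0b1001, 0b1010, 0b1100]

# Every maximal independent set arises from some ordering (list its elements first).
bases = {greedy_maximal_independent(p) for p in permutations(S)}
sizes = {len(b) for b in bases}

print(len(bases), sizes)
```

Different orderings produce different maximal independent sets, but only one cardinality occurs; that common value is the dimension of this small dependence system.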


2.6.6 Other Formulations of Dependence Theory 


For the sake of completeness, this subsection surveys a number of other views of 
dependence theory. Since this subsection is not essential for anything further in the book, 
many results are not proved. Most of the missing proofs can be found in the book by 
Oxley [1]. 

Fix a set X and let ℐ be a family of subsets of X. The pair (X, ℐ) is called a 
matroid if and only if the following axioms hold: 


(M1) The family ℐ is closed under taking subsets. 

(M2) (Exchange axiom) If A, B ∈ ℐ with |A| > |B|, then there exists an element 
a ∈ A − B such that {a} ∪ B ∈ ℐ. 

(M3) If every finite subset of a set A belongs to ℐ, then A belongs to ℐ. 


Note that if A and B are maximal elements of (ℐ, ⊆), then |A| = |B| is an 
immediate consequence of axiom (M2). However, from axioms (M1) and (M2) alone, 
it does not follow that maximal elements even exist. One needs (M3) for that. 
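
For a finite ground set, axiom (M3) holds automatically, so only (M1) and (M2) need checking. The following brute-force sketch tests those two axioms for a family given explicitly as a list of subsets. The function name `is_matroid` and both sample families are assumptions introduced for illustration.

```python
from itertools import combinations

def is_matroid(family):
    """Check (M1) and (M2) for a finite family of finite sets."""
    fam = {frozenset(A) for A in family}
    # (M1): closed under taking subsets.
    for A in fam:
        for k in range(len(A) + 1):
            for sub in combinations(A, k):
                if frozenset(sub) not in fam:
                    return False
    # (M2): a larger member can always donate an element to a smaller one.
    for A in fam:
        for B in fam:
            if len(A) > len(B) and not any(B | {a} in fam for a in A - B):
                return False
    return True

# All subsets of {1, 2, 3, 4} with at most two elements.
small_sets = [c for k in range(3) for c in combinations([1, 2, 3, 4], k)]

# Closed under subsets, but {1, 2} cannot donate an element to {3}.
not_exchangeable = [(), (1,), (2,), (3,), (1, 2)]

print(is_matroid(small_sets), is_matroid(not_exchangeable))
```

The second family satisfies (M1) and fails only (M2), which shows that the exchange axiom is a genuine restriction.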


Lemma 2.6.5 If (X, ℐ) is a matroid, then every element of ℐ lies in a maximal 
element of ℐ. 


Proof This proof utilizes Zorn’s lemma. Let A_0 be an arbitrary member of ℐ. We 
examine the poset of all subsets in ℐ which contain A_0. Let A_1 ⊆ A_2 ⊆ ··· be a 
chain in this poset and let B be the union of all the sets A_i. Consider any finite subset 
F of B. Then each element f ∈ F lies in some member A_{i(f)} of the chain. Setting 
m to be the maximal index in the finite set {i(f) | f ∈ F}, we see that F ⊆ A_m. 
Since A_m ∈ ℐ, (M1) implies F ∈ ℐ. But since F was an arbitrary finite subset of 
B, (M3) shows that B ∈ ℐ. Thus every chain in the poset of elements of ℐ 
which contain A_0 has an upper bound in that poset, so, by Zorn’s lemma, that poset 
contains a maximal element M. Clearly M is a maximal member of ℐ, since any 
member of ℐ which contains M also contains A_0, and so lies in the poset of which 
M is a maximal member. Thus A_0 lies in a maximal member of ℐ, as required. 


Example 19 Let D ⊆ X × F(X) be a dependence theory. Let ℐ be the set of 
independent sets of the dependence theory D. Then (X, ℐ) is a matroid. 


Example 20 Suppose (V, E) is a simple graph with vertex set V and edge set E. 
(The adjective “simple” just means that edges are just certain unordered pairs of 
distinct vertices.) A cycle is a sequence of edges (e_0, e_1, ..., e_n) such that e_i shares 
just one vertex with e_{i+1}, and another vertex with e_{i−1}, indices i taken modulo n. 
Thus e_{n−1} shares a vertex with e_n = e_0, and the “length” of the cycle, n, cannot be 
one or two. A graph with no cycles is called a forest; its connected components are 
called trees. Now let ℐ be the collection of all subsets A of E such that the graph 
(V, A) is a forest. Then (E, ℐ) is a matroid. 
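
The forest condition of Example 20 is easy to test by machine. The sketch below uses a union-find structure (a standard technique, not something taken from the text) to decide whether an edge set is a forest, and then checks the exchange axiom (M2) by brute force for the complete graph on four vertices. The choice of graph and all names are assumptions made for illustration.

```python
from itertools import combinations

def is_forest(edges):
    """Return True if the given edges (pairs of vertices) contain no cycle."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru == rw:
            return False  # u and w were already connected: this edge closes a cycle
        parent[ru] = rw
    return True

# Assumed example: the complete graph on the vertices 0, 1, 2, 3.
edges = list(combinations(range(4), 2))
forests = [frozenset(A)
           for k in range(len(edges) + 1)
           for A in combinations(edges, k)
           if is_forest(A)]

# (M2): whenever forest A has more edges than forest B, some edge of A − B extends B.
exchange_ok = all(
    any(is_forest(B | {a}) for a in A - B)
    for A in forests for B in forests if len(A) > len(B)
)

print(len(forests), exchange_ok)
```

The maximal forests found this way are the spanning trees of the graph, all with three edges.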


Example 21 Let F be a fixed family of subsets of a set X. A finite subset 
{x_1, x_2, ..., x_n} of X is said to be a partial transversal of F if and only if there 
are pairwise distinct subsets A_1, ..., A_n of F such that x_i ∈ A_i, i = 1, ..., n; 
that is, the set {x_1, ..., x_n} is a “system of distinct representatives” of the sets {A_i}. 
Now let ℐ be the collection of all subsets of X all of whose finite subsets are partial 
transversals. Then (X, ℐ) is a matroid. 


Example 22 Suppose (X, ℐ) is any given matroid, and Y is any subset of X. Let 
ℐ(Y) be all members of ℐ which happen to be contained in the subset Y. Then it 
is straightforward that the axioms (M1), (M2) and (M3) all hold for (Y, ℐ(Y)). The 
matroid (Y, ℐ(Y)) is called the induced matroid on Y. 


Here is another approach to matroids, using a so-called rank function. 


Theorem 2.6.6 Suppose ρ is a map from the set of all subsets of a set X into the 
cardinal numbers, satisfying these three rules: 


(R1) For each subset Y of X, 0 ≤ ρ(Y) ≤ |Y|. 
(R2) (Monotonicity) If Y_1 ⊆ Y_2 ⊆ X, then ρ(Y_1) ≤ ρ(Y_2). 
(R3) (The submodular inequality) If A and B are subsets of X, then 


ρ(A ∪ B) + ρ(A ∩ B) ≤ ρ(A) + ρ(B). 

Let ℐ be the collection of all subsets Y of X with the following property: 


(*) For every finite subset F_Y of Y we have |F_Y| = ρ(F_Y). 

Then (X, ℐ) is a matroid. 
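
As an illustration of Theorem 2.6.6, the sketch below takes the rank function to be linear rank over GF(2) on a small set of vectors (an example chosen here, not one from the text), verifies (R1) through (R3) by brute force, and then forms the family of sets satisfying property (*). All names are hypothetical.

```python
from itertools import combinations

def rank(vectors):
    """GF(2)-rank of a collection of bitmask vectors: the span has 2**rank elements."""
    reached = {0}
    for v in vectors:
        reached |= {r ^ v for r in reached}
    return len(reached).bit_length() - 1

# Assumed example: four vectors with the single relation 1 XOR 2 = 3.
X = (1, 2, 3, 4)
subsets = [frozenset(c) for k in range(len(X) + 1) for c in combinations(X, k)]

r1 = all(0 <= rank(Y) <= len(Y) for Y in subsets)
r2 = all(rank(A) <= rank(B) for A in subsets for B in subsets if A <= B)
r3 = all(rank(A | B) + rank(A & B) <= rank(A) + rank(B)
         for A in subsets for B in subsets)

# Property (*): every (finite) subset F of Y satisfies |F| = rank(F).
independent = [Y for Y in subsets
               if all(len(F) == rank(F) for F in subsets if F <= Y)]

print(r1, r2, r3, len(independent))
```

The sets passing (*) are exactly those avoiding the dependent triple {1, 2, 3}, that is, the linearly independent subsets of X.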
Now suppose we are given a matroid (X, ℐ). Can we recover a dependence theory 
from this matroid? Clearly we need to have a definition of “dependence” constructed 
exclusively from matroid notions. Consider this definition: 

Definition: Let x be an element of X and let A be a finite subset of X. We say that 
x depends on A if and only if there exists a subset A_0 of A which lies in ℐ, such 
that either (i) x ∈ A_0, or (ii) {x} ∪ A_0 is not in ℐ. 


Theorem 2.6.7 Given a matroid (X, ℐ), let the relation of “dependence” between 
elements of X and finite subsets of X be given as in the preceding definition. Then 
this notion satisfies the axioms of a dependence theory. 


It now follows from Example 19, and the assertion of Theorem 2.6.7, that matroids 
and dependence theories are basically the same thing. 

There are purely combinatorial ways to express the gist of a dependence theory, 
and these are the various formulations of the notion of a matroid in terms of circuits, 
in terms of flats, or in terms of closure operators. 

Let us consider flats, for a moment. We have already defined them from the point 
of view of a dependence theory. From a matroid point of view, the flat ⟨A⟩ spanned by 
subset A in matroid (X, ℐ) is the set of all elements x for which {x} ∪ A_x ∉ ℐ for 
some finite subset A_x of A. It is an easy exercise (Exercise (6) in Sect. 2.7.4) to show 
that ⟨⟨A⟩⟩ = ⟨A⟩, for all subsets A of X. In fact the reader should be able to prove 
the following: 


Theorem 2.6.8 The mapping τ which sends each subset A of X to the flat ⟨A⟩ 
spanned by A, is a closure operator on the lattice of all subsets of X. The image 
sets—or “closed” sets—form a lattice: 


1. The intersection of two closed sets is closed, and is the meet in this lattice. 
2. The closure of the set-theoretic union of two sets is a join, that is, a global minimum 
in the poset of all flats above the two sets. 


The characterization of matroids by closure operators is the following: 


Proposition 2.6.9 Suppose P = P(X) is the poset of all subsets of a set X and 
τ : P → P satisfies 


(Increasing) For each subset A, A ⊆ τ(A). 

(Closure) The mapping τ is a closure operator, that is, it is an idempotent 
monotone mapping of a poset into itself. 

(The Steinitz-MacLane Exchange Property) If A ⊆ X and y, z ∈ X − τ(A), then 
y ∈ τ(A ∪ {z}) implies z ∈ τ(A ∪ {y}). 


Then, setting 
ℐ = {A ⊆ X | x ∉ τ(A − {x}) for all x ∈ A}, 


we have that M := (X, ℐ) is a matroid. 

Conversely, if (X, ℐ) is a matroid, then the mapping τ which takes each subset 
A of X to the flat ⟨A⟩ which it spans is a monotone increasing closure operator P → P 
possessing the Steinitz-MacLane Exchange Property. 
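
The hypotheses of Proposition 2.6.9 can be checked mechanically on a small example. In the sketch below, the closure operator is assumed to be "GF(2)-span intersected with X" on five bitmask vectors; this choice and the names used are illustrative only. The code verifies the three hypotheses and then builds the family ℐ as in the proposition.

```python
from itertools import combinations

def span(vectors):
    reached = {0}
    for v in vectors:
        reached |= {r ^ v for r in reached}
    return reached

# Assumed example: five vectors of GF(2)^3 with relations 1^2 = 3 and 1^4 = 5.
X = frozenset({1, 2, 3, 4, 5})
subsets = [frozenset(c) for k in range(len(X) + 1) for c in combinations(sorted(X), k)]

def tau(A):
    """Illustrative closure: the elements of X lying in the GF(2)-span of A."""
    return frozenset(x for x in X if x in span(A))

increasing = all(A <= tau(A) for A in subsets)
monotone = all(tau(A) <= tau(B) for A in subsets for B in subsets if A <= B)
idempotent = all(tau(tau(A)) == tau(A) for A in subsets)

# Exchange: for y, z outside tau(A), y in tau(A ∪ {z}) forces z in tau(A ∪ {y}).
exchange = all(
    z in tau(A | {y})
    for A in subsets
    for y in X - tau(A)
    for z in X - tau(A)
    if y in tau(A | {z})
)

# The family of the proposition: no element lies in the closure of the others.
independent = [A for A in subsets if all(x not in tau(A - {x}) for x in A)]

print(increasing, monotone, idempotent, exchange, len(independent))
```

In this example the resulting family consists of the linearly independent subsets of X, so the closure-operator description and the independent-set description agree.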
2.7 Exercises 


2.7.1 Exercises for Sect. 2.2 
1. Suppose ≤ is a reflexive and transitive relation on a set X. (Such a relation is 
called a pseudo-order.) For any two elements x and y of X, let us write x ~ y if 
and only if x ≤ y and also y ≤ x. 


(a) Show that “~” is an equivalence relation. 

(b) For each element x of X, let [x] be the ~-equivalence class containing 
element x. Show that if x ≤ y then a ≤ b for every element a ∈ [x] and 
element b ∈ [y]. (In this case we write “[x] ≤ [y]”.) 

(c) Let X/~ denote the collection of all ~-classes of X. Show that (X/~, ≤) 
is a poset. 


2. Make a full list of the posets defined in Examples 1 and 2 above which 


(a) have a zero element. 
(b) have a one element. 
(c) are locally finite. 


3. Let P be a fixed poset. If there is a bijection f : X → Y, prove that there exist 
isomorphisms of posets: 


∏_{x∈X} P ≅ ∏_{y∈Y} P, 

and 

∑_{x∈X} P ≅ ∑_{y∈Y} P, if P has a zero element. 


(This means that the index set X has only a weak effect on the definition of a 
product.) 

4. Recall that the elements of the divisor poset D are the positive integers, with a ≤ b 
if and only if the integer a divides the integer b. 

The poset of all finite multisets on the set of natural numbers consists of all 
infinite sequences of natural numbers with only finitely many positive entries. 
This poset is denoted M_{<∞}(N). 

For this exercise, the student is asked to assemble a proof of the following theorem. 


Theorem 2.7.1 The divisor poset D is isomorphic to the poset M_{<∞}(N⁺) of all finite 
multisets over the set N⁺ of positive integers. That is, we have an isomorphism 


ε : (D, ≤) → ∑_{i∈N⁺} N_i        (2.23) 


where each N_i is a copy of the totally ordered poset of the natural numbers. 
[Hint: We sketch the argument. Notice that the right side of (2.23) consists of 
all sequences of non-negative integers of which only finitely many are positive 
(equivalently, all finite multisets of positive integers). The mapping ε is easy 
to define. First place the positive prime numbers in ascending (natural) order 
2 = p_1 < p_2 < ···. For example p_6 = 13. Now any integer d ∈ D greater than 
one has a unique factorization into positive prime powers, with the primes placed 
from left to right in ascending order: 


d = ∏_{σ∈J_d} p_σ^{a_σ}, 


for some ascending sequence of positive integers J_d. If one 
declares ε(d) to be the function f : N⁺ → N such that 


f(σ) = a_σ if σ ∈ J_d, 
f(σ) = 0 if σ ∉ J_d, 


and declares ε(1) := 0, the constant function with all values 0, then one sees that 
ε(c) ≤ ε(d) in the sum on the right side of Eq. (2.23) if and only if c divides d 
evenly, i.e. c ≤ d in (D, ≤). Clearly ε is a bijection.] 


5. Recall that if (P, ≤) is a poset and X ⊆ P, then the symbol P_X denotes the order ideal 
P_X := {z ∈ P | z ≤ x for some x ∈ X}. 


Show that P_X = ⋃{P_x | x ∈ X}. 
6. For the filters P* and P” generated by these sets, show that 


(A Pe} a (A re _ A pixXuy). 


7. Let X be a subset of a poset (P, ≤), and let P_X be the order ideal of all elements 
bounded above by at least one element of X (see p. 31 or Exercise (5) in Sect. 2.7.1). 
Prove that P_X is the intersection of all order ideals of P which 
contain X. 

8. Give an example of a poset P and subset X for which the order ideal P_X is not 
generated by an antichain. 

9. Give an example of a poset (P, ≤) and an order ideal J of P such that J does 
not have the form P_X for any antichain X. 

10. Let X be an infinite set. Recall that a partition of X is a decomposition π = {X_α} 
of X into pairwise disjoint non-empty subsets X_α called the components of the 
partition. (The word “decomposition” is there to indicate that the union of the 
X_α is X.) A component X_α is said to be trivial if it contains exactly one element 
of X. Such a partition π is said to be a finitary partition if and only if finitely 
many of the components are non-trivial and each of these is a finite set. 
Let FP(X) be the full collection of finitary partitions of X. Show that with 
respect to the refinement relation, FP(X) is a locally finite poset. 

11. Let F := {A_σ | σ ∈ I} ⊆ P(X). Let α : P(X) → P(I) be the mapping of 
posets which sends each subset U of X to the set α(U) := {σ ∈ I | U ⊆ A_σ}. (If 
no such subset A_σ contains U, then α(U) is the empty set.) Similarly, for each 
subset K of I, set β(K) := ⋂_{σ∈K} A_σ, so that β : P(I) → P(X) is a mapping of 
posets. 

(a) Show that α and β are both order reversing and that (P(X), P(I), α, β) 
is a Galois connection. 

(b) Show that the closed elements of P(X) are those subsets expressible as 
intersections of A_σ’s. 

(c) Show that a subset J of I is closed if and only if it has the property: If 
A_τ ⊇ ⋂_{σ∈J} A_σ, then τ ∈ J. 
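
For Theorem 2.7.1 in Exercise 4 of this subsection, the correspondence between divisibility and componentwise comparison of exponents can be spot-checked numerically. The sketch below records the exponents in a dictionary keyed by the prime itself rather than by its index σ; that encoding and the function names are assumptions made for this illustration.

```python
def prime_exponents(d):
    """Map a positive integer to {prime: exponent}; the integer 1 maps to {}."""
    exps = {}
    p = 2
    while p * p <= d:
        while d % p == 0:
            exps[p] = exps.get(p, 0) + 1
            d //= p
        p += 1
    if d > 1:
        exps[d] = exps.get(d, 0) + 1
    return exps

def leq(f, g):
    """Componentwise comparison f <= g of two finitely supported exponent functions."""
    return all(e <= g.get(p, 0) for p, e in f.items())

# c divides d exactly when the exponent function of c is dominated by that of d.
order_preserved = all(
    (d % c == 0) == leq(prime_exponents(c), prime_exponents(d))
    for c in range(1, 200) for d in range(1, 200)
)

print(prime_exponents(360), order_preserved)
```

This is only a finite check on the integers below 200, not a proof; the exercise asks for the general argument.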


2.7.2 Exercises for Sects. 2.3 and 2.4 


1. We let [n] be the chain {1 < 2 < ··· < n} in the usual total ordering of the 
positive integers. Show that the infinite union of disjoint chains, 


[1] ∪ [2] ∪ [3] ∪ ···, 


satisfies (FC) but does not possess finite height, and possesses neither a 1 nor a 
0. 

2. If we adjoin 0 to the poset presented in the previous Exercise, show that the 
Frattini element exists, and is 0. 

3. Show that the product of posets 


[1] x [2] x [3] x---, 


does not satisfy (FC). 

4. Let P be a lower semilattice with 0 and 1 satisfying (FC). Set B := max(P) and 
let P̄ be the induced subposet generated by B, that is, the set of elements of P 
expressible as a finite meet of elements of B. We understand the empty meet to 
be the element 1, so the latter is an element of P̄. 


(a) Show that P̄ has the Frattini element φ(P) as its zero element 0_P̄. 
(b) For each x ∈ P − {1}, show that the induced poset P^x ∩ P̄ has a unique 
minimal member (which we shall call σ(x)). 
(c) Defining σ(1) = 1, show the following: 
i. For all x ∈ P, x ≤ σ(x). (This is built into the definition of σ.) 
ii. The mapping σ : P → P̄ is a surjective morphism of posets. 


iii. For each x ∈ P, one has σ(σ(x)) = σ(x). 
(Recall that any morphism onto an induced subposet which satisfies these 
three conditions is called a closure operator.) 


5. Suppose P = N × {0, 1}. We totally order P (lexicographically) as follows: 


(a) (a, 0) ≤ (b, 1) for any a, b ∈ N. 
(b) (a, 0) ≤ (b, 0) if and only if a ≤ b, a, b ∈ N. 
(c) (a, 1) ≤ (b, 1) if and only if a ≤ b, a, b ∈ N. 


Show that (P, ≤) has the Descending Chain Condition (DCC), but that there 
exist intervals (x, y) with no finite unrefinable chain from x to y. 


6. Suppose (P, ≤) is a poset with a zero element 0. Recall from Sect. 2.4.2 that in 
this case an atom of (P, ≤) is an element a distinct from 0 with the property 
that there is no element x ∈ P such that 0 < x < a; that is, (0, a) is a cover. 
Let A be the set of atoms of P. Assume now that (P, ≤) has the property that 
every interval (0, b] possesses the DCC. Show that either P = {0} or that the set 
A of atoms is non-empty. 


7. Give an example of a locally finite poset which does not possess the descending 
chain condition. 


8. Suppose (P, ≤) is a locally finite poset which possesses a zero element 0. Show 
by means of an example that such a poset need not possess the descending chain 
condition. 


9. Suppose L is a lattice. Give an induction proof showing that for any finite 
collection {a_1, ..., a_n} of elements of L, the elements 

a_1 ∨ ··· ∨ a_n and a_1 ∧ ··· ∧ a_n 


exist and are respectively the least upper bound and greatest lower bound in 
L of the set {a_1, ..., a_n}. 

10. Suppose (P, ≤) is a lower semilattice with the descending chain condition 
(DCC). Show that any principal order ideal is a lattice. Conclude that for any 
non-empty subset X ⊆ P, the order ideal 


P_X := {z ∈ P | z ≤ x for all x ∈ X} 


is always a lattice. 
11. Suppose (P, ≤) is a locally finite poset. 


(a) Suppose (P, ≤) possesses a zero element 0 and at least one other element. 
Show that the set A of atoms of (P, ≤) is not empty. (Note that we are not 
assuming the Descending Chain Condition, so this is a little different than Problem 
(6) in Sect. 2.7.2.) 

(b) Now assume that (P, ≤) is a lower semilattice. Show that any principal 
order ideal is a finite lattice. 
12. A lattice L := (L, ≤) is said to be distributive if and only if, for any elements 
a, b and c in L, one always has: 


a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c).        (2.24) 


An example of a distributive lattice is the power poset P(X) of all subsets of a 
set X. 


(a) Prove that if L is a distributive lattice and (M, ≤) is an induced subposet 
closed under taking pair-wise meets and joins, then M is also a distributive 
lattice. 

(b) Use the result of item (a) to prove the following: 

i. The Boolean poset B(X) of all finite subsets of X is a distributive 
lattice. 

ii. Let 𝒥(P) be the poset of all order ideals of a poset (P, ≤) under the 
containment relation. Use item (a) to prove that 𝒥(P) is a distributive 
lattice. [One must define “meet” and “join” of order ideals and set 
L = P(P) in item (a) of this exercise.] 


2.7.3 Exercises for Sect. 2.5 


1. (The Jordan-Hölder Theorem implies the Fundamental Theorem of Arithmetic.) 
Let (P, ≤) be the poset of positive integers where a ≤ b if and only if integer a 
divides integer b evenly (Example 1, of this chapter). 


(a) Show that (P, ≤) is a lower semimodular semilattice with all intervals 
algebraic. Let μ : Cov_P → N⁺ be the function which records the prime number 
b/a at every cover (a, b) of (P, ≤). Indicate why μ is a semimodular 
function in the sense given in Eq. (2.12) of the Jordan-Hölder Theorem. 

(b) Suppose integer a properly divides integer b. For every factorization b/a = 
p_1 p_2 ··· p_r of b/a into primes show that there exists a finite unrefinable 
chain (a = a_0, a_1, a_2, ..., a_r = b) such that a_i / a_{i−1} = p_i for i = 1, ..., r. 

(c) Conclude from the Jordan-Hölder Theorem that every positive integer 
possesses a factorization into prime numbers, and that the multiset of positive 
prime numbers involved in the factorization is unique. 


2. Show that the following lattices are modular: 


(i) The poset of subspaces of a vector space. 
(ii) The poset of subgroups of an abelian group. 


In both cases, the order relationship is that of containment. 
3. Show that the product L_1 × L_2 of two modular lattices is modular. If the lattices 
L_σ, σ ∈ I, are modular and possess a “zero”, does the modularity extend to the 
direct sum ∑_{σ∈I} L_σ? 

4. Is the lattice of partitions of a finite set modular? (Here the order relationship is 
refinement of one partition by another.) 

5. Write out an explicit proof of Corollary 2.5.10. 
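
The phenomenon behind Exercise 1 of this subsection can be observed experimentally. In the divisor poset a cover has the form (a, ap) with p prime, so an unrefinable chain from a to b is a sequence of prime labels. The sketch below (function names are hypothetical) enumerates every such chain from 1 to 360 and collects the multisets of labels.

```python
from collections import Counter

def prime_divisors(m):
    """Distinct primes dividing the positive integer m, in ascending order."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes

def chains(a, b):
    """Yield the list of cover labels along each unrefinable chain from a up to b
    in the divisor poset. Assumes a divides b."""
    if a == b:
        yield []
        return
    for p in prime_divisors(b // a):
        for rest in chains(a * p, b):
            yield [p] + rest

labels = list(chains(1, 360))
multisets = {frozenset(Counter(c).items()) for c in labels}

print(len(labels), multisets)
```

Many different chains occur, but they all carry the same multiset of primes, which is the uniqueness statement the exercise derives from the Jordan-Hölder Theorem.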


2.7.4 Exercises for Sect. 2.6 


1. Prove the three parts of Lemma 2.6.1 for the extended definition of dependence 
allowing infinite subsets of S. 

2. Verify the axioms of a matroid for Example 20, on the edge-set of a graph. 

3. Verify the axioms of a matroid for Example 21, on finite partial transversals. 

4. Let Γ = (V, E) be a simple graph. Let M = (E, ℐ) be the matroid of Example 20. 
For any subset F of the edge set E, let 


(V, F) = ⋃_{σ∈K} (V_σ, F_σ) 


be a decomposition of the graph (V, F) into connected components. For each connected 
component (V_σ, F_σ), let E_σ be the collection of edges connecting two vertices in V_σ. 
Thus (V_σ, E_σ) is the subgraph induced on the vertex set V_σ. Prove that in the 
matroid, the flat spanned by F is the union ⋃_{σ∈K} E_σ. 

5. Let M = (X, ℐ) be a matroid. A set which is minimal with respect to not lying 
in ℐ is called a circuit. Show that any circuit is finite. (Remark: There is also a 
characterization of matroids by circuits. See Matroid Theory, by James G. Oxley 
[1], Proposition 1.3.10.) 

6. Let M = (X, ℐ) be a matroid. For each subset A of X, let ⟨A⟩ be defined as the 
set of all elements x ∈ X for which there exists a finite subset A_x ⊆ A such that 
{x} ∪ A_x ∉ ℐ (this is the matroid version of “the flat generated by A” defined in 
Sect. 2.6.4). Show that ⟨A⟩ = ⟨⟨A⟩⟩. 
Chapter 3 
Review of Elementary Group Properties 


Abstract Basic properties of groups are collected in this chapter. Here are exposed 
the concepts of order of a group (any cardinal number) or of a group element (finite 
or countable order), subgroup, coset, the three fundamental theorems of 
homomorphisms, semi-direct products and so forth. 


3.1 Introduction 


Groups are systems of symmetries of objects, in particular mathematical objects. 
Understanding groups can be useful in classifying objects in a particular class. One 
uses a group of symmetries to transfer any object of the class to a representative 
object which is an easily-studied canonical form. For example, the group GL(n, F) × 
GL(n, F), generated by elementary row and column operations on the class M_n(F) 
of n × n matrices, is used to transport an arbitrary matrix to a more easily studied 
canonical form. 

This is just one of the reasons that group theory is needed, whatever field of 
mathematics you might choose to enter. There are of course many other reasons 
having to do with special uses of groups (such as “Pólya counting”,¹ quantum physics, 
invariant theory etc.). But overall, the need to classify things seems the broadest 
reason that group theory is pertinent to all of mathematics. 

Groups are defined by a hallowed set of axioms which are useful only for the 
purpose of banishing all ambiguities from the subject. Somehow, staring at the axioms 
of a group mesmerizes one away from their basic and natural habitat. Every group 
that ever exists in the world is in fact the full group of symmetries of something. That 
is what the theory of groups is: the study of symmetries. But it is useful to know this 
only on a philosophical plane. Knowing that there is a set of objects such that every 
group—or even every finite group—is the full group of automorphisms of at least 
one of these objects, is not very helpful for classifying the groups themselves. 


¹This has a long pre-Pólya history. 
3.2 Definition and Examples 


This is probably a good place to explain a convention. A group is a set G with a binary 
operation with certain properties—of which there are a good many examples. In some 
of these examples the binary operation is written as “+”; in others, it is written as “∘” 
or “×”. (If one uses “+”, one might think the operation is commutative. If one uses 
“∘”, the elements seem to be mappings, and if one uses “×”, Cartesian products can 
confuse the issue.) How then does one speak generally about all possible groups? The 
standard solution is this: The group operation of an arbitrary group, will be indicated 
by simple juxtaposition, the act of writing one symbol directly after the other—thus, 
the group operation, applied to the ordered pair (x, y) will be denoted xy. 

A group is a set G equipped with a binary operation G × G → G (denoted here 
by juxtaposition) such that the following axioms hold: 


1. (The Associative Law) The binary operation is associative—that is (ab)c = a(bc) 
for all elements a, b, c of G. (The parentheses indicate the temporal order in which 
operations were performed. This axiom more-or-less says that the temporal order 
of applying the group operations doesn’t matter, while, of course, the left-right 
order of the elements being operated on does matter.) 

2. (Identity Element) There exists an element, say e, such that eg = g = ge for all 
elements g in G. 

3. (The Existence of Inverses) For each element x in G, there exists an element x’ 
such that x’x = xx’ = e where e is the element referred to in Axiom 2.² 


One immediately deduces the following: 


Lemma 3.2.1 For any group G one has: 


1. The identity element e is the unique element of G possessing the property of Axiom 
2. 

2. Given element g in G, there is at most one element g′ such that gg′ = g′g = e. 

3. In light of the preceding item 2, we may denote the unique inverse of g by the 
symbol g^{-1}. Then note that 


(i) (ab)^{-1} = (b^{-1})(a^{-1}) for all elements a, b ∈ G. 
(ii) (a^{-1})^{-1} = a, for each element a ∈ G. 
(iii) For any element x and natural number k, the product xx...x (with k factors) 
is a unique element which we denote as x^k. We have (x^k)^{-1} = (x^{-1})^k. (As 
a notational convention, this element is also written as x^{-k}.) 
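
The statements of Lemma 3.2.1 can be checked by hand in any small group. As an illustration, the sketch below does so by machine in the group of all permutations of {0, 1, 2}, with a permutation stored as the tuple of its images. The convention that the product ab means "apply b first, then a", and all function names, are assumptions made for this example.

```python
from itertools import permutations

def compose(a, b):
    """Product ab under the assumed convention: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(a):
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def power(x, k):
    """The product of k copies of x (the identity when k = 0)."""
    result = tuple(range(len(x)))
    for _ in range(k):
        result = compose(result, x)
    return result

G = list(permutations(range(3)))
identity = tuple(range(3))

# Item 1: exactly one element behaves as a two-sided identity.
identities = [e for e in G if all(compose(e, g) == g == compose(g, e) for g in G)]

# Item 3(i): inverting a product reverses the order of the factors.
reversal = all(inverse(compose(a, b)) == compose(inverse(b), inverse(a))
               for a in G for b in G)

# Without the reversal the formula fails for some non-commuting pair.
naive_fails = any(inverse(compose(a, b)) != compose(inverse(a), inverse(b))
                  for a in G for b in G)

# Item 3(iii): (x^k)^{-1} = (x^{-1})^k.
powers = all(inverse(power(x, k)) == power(inverse(x), k)
             for x in G for k in range(5))

print(identities == [identity], reversal, naive_fails, powers)
```

The failure of the unreversed formula shows why the order of the factors in item 3(i) matters in a non-commutative group.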


²People have figured out how to “improve” these axioms, by hypothesizing only right identities 
and left inverses and so on. There is no attempt here to make these axioms independent or logically 
neat. Of course, the axioms indeed over-state their case; but a little redundancy won’t hurt at this 
beginning stage. 
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Example 23 Familiar Examples of Groups: 


(a) 
(b) 
(c) 
(d) 


(e) 


(f) 


(g) 


(h) 


The group (Z, +) of integers under the operation of addition. 

The group of non-zero rational numbers under multiplication. 

The group of non-zero complex numbers under multiplication. 

The cyclic group Zn of order n. This can be thought of as the group of rotations of 
a regular n-gon. This group consists of the elements {1, x, x7 ..., x”~!} where 
x is a clockwise rotation of 27/n radians and “1” is the identity transformation. 
All multiplications of the powers of x can be deduced solely from the algebraic 
identity x” = 1. 

The dihedral group D2, of order 2n. This is the group of all rigid symmetries of 
the regular n-gon. In addition to the group of n rotations just considered, it also 
contains n reflections which are symmetries of the n-gon. If n is an odd number, 
these reflections are about an axis through a corner or vertex of the polygon and 
bisecting an opposite side. If n is even, the axes of the reflections are of two 
sorts: those which go from a vertex through its opposite vertex, and those which 
bisect a pair of opposite sides. There are then n/2 of each type. The elements of 
the group all have the form t'x/, where f is any reflection, x is the clockwise 
rotation by 27/n radians and 0 <i < 1 andO < j <n —1. The results of 
all multiplications of such elements can be deduced entirely from the relations 
xv =1et=1txax!t (and its consequence, txt= Xt). 

The isomorphism type of the dihedral group of order 2n is denoted D2,. In the 
special case that n = 2, each group in the resulting class D4 is called a fours 
group. It is distinguished from the cyclic group of order four by the fact that the 
square of any of its elements is the identity element. 

The symmetric groups Sym(X). Let X be a set. A permutation of the elements 
of X is a bijective mapping X — X. This class of mappings is closed under 
composition of mappings, inverses exists, and it is easy to verify that they form 
a group with the identity mapping ly which takes each element to itself, as 
the group identity element. This group is called the symmetric group on X and 
is denoted Sym(X). If |X| = n it is well known from elementary counting 
principles that there are exactly n! permutations of X. In this case one writes 
Sym(n) for Sym(X), since the names or identities of the elements of X do not 
really affect the nature of the group of all permutations. 

The group of rigid motions of a (regular) cube. Just imagine a wooden cube on 
the desk before you. We consider the ways that cube can be rotated so that it 
achieves a position congruent to the original one. The result of doing one rotation 
after another is still some sort of rotation. It should be clear that these rotations 
can have axes which are situated in three different ways with respect to the cube. 
The axis of rotation may pass through the centers of opposite faces, it may pass 
though the midpoints of opposite edges, or it could be passing through opposite 
vertices. 

Let G = (V, E) be asimple graph. This means the edges E are pairwise distinct 
2-subsets of the vertex set V. Two (distinct) vertices are said to be adjacent if 
and only if they are the elements of an edge. Now the group of automorphisms 
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of the graph G is the set of permutations of the set of vertices V which preserve 
the adjacency relation. The group operation is composition of the permutations. 

Let R be any ring with identity element e.³ An element u of R is said to be a
unit if and only if there exists a two-sided multiplicative inverse in R, that is, an
element u′ such that u′u = e = uu′. (Observe in this case that u′ itself must be
a unit.) Then the set U(R) of all units of the ring R forms a group under ring
multiplication. Some specific cases:


(i) The group U(Z) of units of the ring of integers Z is the set of numbers
{+1, −1} under multiplication. Clearly this group behaves just like the cyclic
group of order 2, one of those introduced in part (d) above.

(ii) Let D = Z ⊕ Zi, where i^2 = −1, be the ring of Gaussian integers
{a + bi | a, b ∈ Z}, a multiplicatively and additively closed subset (that is, a
subring) of the ring of all complex numbers. Then the reader may check that
U(D) is the set {±1, ±i} under multiplication. This is virtually identical
with the cyclic group of order four as defined in part (d) of this series of
examples.

(iii) Let V be a left vector space over a division ring D. Let hom(V, V) be the
collection of all linear transformations f : V → V (viewed as right operators
on the set of vectors V). This is an additive group under the operation “+”
where, for all f, g ∈ hom(V, V), (f + g) is the linear transformation which
takes each vector v to vf + vg. With composition of such transformations
as “multiplication” the set hom(V, V) becomes a ring.⁴

Now the group of units of hom(V, V) would be the set GL(V) of all linear
transformations t : V → V where t is bijective on the set of vectors (i.e.
a permutation of the set of vectors). (V is not assumed to be finite or even
finite-dimensional in this example.) The group GL(V) is called the general
linear group of V.

(iv) The group GL(n, F). Let F be a field, and let G = GL(n, F) be the set
of all invertible n × n matrices with entries in F.⁵ This set is closed under
multiplication and forms a group. In fact it is the group of units of the ring of
all n × n matrices with respect to ordinary matrix multiplication and matrix
addition.
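
The check suggested in case (ii) can be carried out mechanically. The sketch below is an illustration only: a Gaussian integer a + bi is stored as the pair (a, b), and the search is restricted to a small box, which is an assumption made here. It loses nothing, because a unit and its inverse both have norm 1 and therefore lie in the box. The names gauss_mul and gaussian_units are hypothetical.

```python
def gauss_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def gaussian_units(bound=3):
    """Gaussian integers in the box |a|, |b| <= bound having an inverse in the box."""
    box = [(a, b) for a in range(-bound, bound + 1) for b in range(-bound, bound + 1)]
    return sorted(z for z in box if any(gauss_mul(z, w) == (1, 0) for w in box))

print(gaussian_units())   # the pairs standing for -1, -i, i, 1

# The powers of i run through all four units, so U(D) is cyclic of order four.
i = (0, 1)
powers, z = [], (1, 0)
for _ in range(4):
    z = gauss_mul(z, i)
    powers.append(z)
print(sorted(powers) == gaussian_units())
```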


³ It is presumed that the reader has met rings before. Not much beyond the usual definitions is
presumed here.


⁴ Actually it is more, for hom(V, V) can be made to have the structure of a vector space, and hence
an algebra if D possesses an anti-automorphism—e.g. when D is a field. But we can get into this 
later. 


⁵ Recall from your favorite undergraduate linear algebra or matrix theory course that if n is a positive
integer, then an n-by-n matrix M has a right inverse if and only if it has a left inverse (this is a 
consequence of the equality of row and column ranks). 
Suppose x is an element of a group G. We set x^0 := e, x^1 := x, x^2 := xx, and
for each natural number n, inductively define x^n := x x^{n−1}. If there exists a natural
number n such that x^n = e, then x is said to have finite order. If n is the smallest
positive integer such that x^n = e then we write o(x) = n, calling n the order of the
element x. Otherwise, we say that x has infinite order and write o(x) = ∞. In the
latter case, the powers e, x, x^2, ... of x are all distinct, for if x^n = x^m for n < m,
then e = x^n x^{−n} = x^m x^{−n} = x^{m−n} and so x would have finite order.


Lemma 3.2.2 Let G be a group, and let x ∈ G be an element of finite order n.


(i) If x^m = e, then n | m.
(ii) If n and m are relatively prime, then o(x^m) = n.


Proof We apply the Division Algorithm (Lemma 1.1.1 of Chap. 1) and write m =
qn + r with 0 ≤ r < n. From this, one has r = m − qn and so x^r = x^{m−qn} =
x^m x^{−qn} = e. By definition of o(x) = n together with 0 ≤ r < n, we must have
r = 0, i.e., that n | m, proving part (i).

Next, let o(x^m) = k, and so x^{km} = e. By part (i), we infer that n | km. Since n
and m are relatively prime, Lemma 1.1.3 gives n | k. Since it is clear that (x^m)^n = e,
part (i) applied to the element x^m shows that k | n as well, forcing n = k = o(x^m).
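
Both parts of Lemma 3.2.2 can be tested numerically in a concrete group. The sketch below assumes, purely for illustration, the multiplicative group of nonzero residues modulo a prime p; the helper names order_mod and check_lemma are hypothetical.

```python
from math import gcd

def order_mod(x, p):
    """Order of x in the multiplicative group of nonzero residues mod a prime p."""
    k, power = 1, x % p
    while power != 1:
        power = power * x % p
        k += 1
    return k

def check_lemma(p):
    """Test parts (i) and (ii) for every nonzero residue x and exponents 1 <= m < 3p."""
    for x in range(1, p):
        n = order_mod(x, p)
        for m in range(1, 3 * p):
            # (i): x^m = e forces n | m
            if pow(x, m, p) == 1 and m % n != 0:
                return False
            # (ii): gcd(n, m) = 1 forces o(x^m) = n
            if gcd(n, m) == 1 and order_mod(pow(x, m, p), p) != n:
                return False
    return True

print(order_mod(3, 31), check_lemma(31))
```

A finite search of this kind proves nothing in general, but it is a quick way to catch a misremembered hypothesis, such as dropping the coprimality condition in part (ii).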


Elements of order 2 play a special role in finite group theory, and so are given a 
special name: any element of order two is called an involution. 
The order of a group G is the number |G| of elements within it. 


3.2.2 Subgroups 


Let G be a group. For any two subsets X and Y of G, the symbol XY denotes the
set of all group products xy where x ranges over X and y ranges independently over
Y. (It is not immediately clear just how many group elements are produced in this
way, since a single element g might be expressible as such a product in more than
one way. At least we have |XY| ≤ |X| · |Y|.)

A second convention is to write X^{−1} for the set of all inverses of elements of
the subset X. Thus X^{−1} := {x^{−1} | x ∈ X}. This time |X^{−1}| = |X|, since the
correspondence x ↦ x^{−1} defines a bijection X → X^{−1} (using Lemma 3.2.1, part 3(ii)
here).

A subset H of G is said to be a subgroup of G if and only if 


1. HH ⊆ H, so that by restriction H admits the group operation of G, and
2. with respect to this operation, H is itself a group. 


One easily obtains the useful result: 
Lemma 3.2.3 (Subgroup Criterion) For any nonempty subset X of G, the following are
equivalent:


1. X is a subgroup of G.
2. X = X^{−1} and XX ⊆ X.
3. XX^{−1} ⊆ X.
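
The subgroup criterion lends itself to a direct computational check. The following sketch works inside Sym(3) with permutations stored as tuples of images (an encoding assumed here, not taken from the text) and compares the one-line test "X is nonempty and x y^{−1} ∈ X for all x, y ∈ X" with a direct verification of the subgroup axioms. The names satisfies_criterion and is_subgroup are hypothetical.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def satisfies_criterion(X):
    """X is nonempty and x y^{-1} lies in X for all x, y in X."""
    return bool(X) and all(compose(x, inverse(y)) in X for x in X for y in X)

def is_subgroup(X, identity):
    """Direct check: identity, closure under products, closure under inverses."""
    return (identity in X
            and all(compose(x, y) in X for x in X for y in X)
            and all(inverse(x) in X for x in X))

G = list(permutations(range(3)))
subsets = [set(c) for r in range(1, 7) for c in combinations(G, r)]
print(all(satisfies_criterion(X) == is_subgroup(X, (0, 1, 2)) for X in subsets))
print(sum(satisfies_criterion(X) for X in subsets))   # number of subgroups of Sym(3)
```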


A remark on notation Whenever subset X is a subgroup of G, we write X ≤ G
instead of X ⊆ G. This is in keeping with the general practice in abstract algebra of
writing A ≤ B whenever A is a subobject of the same algebraic species as B. For
example we write this if A is an R-submodule of the R-module B, and its special 
case: when A is a vector subspace of a vector space B. Usually the context should 
make it clear what species of algebraic object we are talking about. Here it is groups. 


Corollary 3.2.4 1. The set-intersection over any family of subgroups of G is a
subgroup of G.

2. If A and B are subgroups of G, then AB is a subgroup of G if and only if
AB = BA (an equality of sets).

3. For any subset X of G, the set ⟨X⟩_G of all finite products y_1 y_2 ⋯ y_n, n ∈ N, where
the y_i range independently over X ∪ X^{−1}, is a subgroup of G, which is contained
in any subgroup of G which contains subset X.
Thus we can also write

⟨X⟩_G = ⋂_{X ⊆ H ≤ G} H,


where the intersection on the right is taken over all subgroups of G which contain
set X.


The proof is left to the reader (see Exercise (2) in Sect. 3.7.1). The subgroup ⟨X⟩_G
(which is often written ⟨X⟩ when the “parent” group G is understood) is called the
subgroup generated by X. As is evident from the Corollary, it is the smallest subgroup
in the poset of all subgroups of G which contains set X.
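
In a finite permutation group the subgroup generated by a set X can be computed by repeatedly multiplying by elements of X ∪ X^{−1} until nothing new appears, which mirrors the description of the generated subgroup as the set of all finite products. The sketch below is one simple way to do this, not an efficient algorithm; permutations are tuples of images and the name generated_subgroup is hypothetical.

```python
def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(X, n):
    """All finite products of elements of X and their inverses, inside Sym(n)."""
    identity = tuple(range(n))
    letters = set(X) | {inverse(x) for x in X}
    subgroup, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for y in letters:
            h = compose(g, y)
            if h not in subgroup:
                subgroup.add(h)
                frontier.append(h)
    return subgroup

four_cycle = (1, 2, 3, 0)       # i -> i + 1 mod 4
transposition = (1, 0, 2, 3)    # swaps 0 and 1
reflection = (2, 1, 0, 3)       # swaps 0 and 2
print(len(generated_subgroup({four_cycle}, 4)))                  # cyclic of order 4
print(len(generated_subgroup({four_cycle, reflection}, 4)))      # dihedral of order 8
print(len(generated_subgroup({four_cycle, transposition}, 4)))   # all of Sym(4)
```

The empty product is taken to be the identity here, so the empty set generates the identity subgroup.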


Example 24 Examples of Subgroups. 


(a) Let Γ = (V, E) be a graph with vertex set V and edge set E, and set H := Aut(Γ),
the group of automorphisms of the graph Γ, as in Example 23, part (h) in the
previous subsection. Then H is a subgroup of Sym(V), the symmetric group on
the vertex set.

(b) Cyclic Subgroups Let x be an element of the group G. The set ⟨x⟩ := {x^n | n ∈ Z}
(where, for negative integers n, we adopt the convention of Lemma 3.2.1 3(iii)
that x^{−n} = (x^{−1})^n) is clearly a subgroup of G (Corollary 3.2.4). Now one sees
that the order of the element x is the (group) order of the subgroup ⟨x⟩.

(c) Let Y be a subset of X. The set of all permutations of X which map the subset
Y onto itself forms a subgroup of Sym(X) called the stabilizer of Y. If G is any
subgroup of Sym(X), then the intersection of the stabilizer of Y with G is called
the stabilizer of Y in G and is denoted Stab_G(Y). This is clearly a subgroup
of G. Thus it makes sense to speak of the stabilizer of a specified vertex in
the automorphism group Aut(Γ) of a graph Γ. (For sets with a single member,
the convention is to write Stab(y) instead of Stab({y}). Note also that all these
definitions apply when G = Sym(X), as well.)

(d) Let us return once more to the example of the group of rotations of a regular cube
(Example (g) of the previous subsection). We asserted that each non-identity rigid 
motion was a rotation in ordinary Euclidean 3-space about an axis symmetrically 
placed on the cube. 

One can see that the group of rotations about an axis through the center of a square 
face and through the center of its opposite face, is a cyclic group generated by a 
rotation y of order four. There are 6/2 = 3 such axes: in fact, they can be taken to 
be the three pair-wise orthogonal coordinate axes of the surrounding Euclidean 
space. This contributes 6 elements y of order four and 3 elements of order two
(such as y^2).

Another type of axis extends from the midpoint of one edge to the midpoint of 
an opposite edge. Rotations about such an axis form a subgroup of order two. 
The generating rotation f of order two does not stabilize any face of the cube, 
and so is not any of the involutions (elements of order two) stabilizing any of the 
previous “face-to-face” axes. Since there are twelve edges in six opposite pairs, 
these “edge-to-edge” axes contribute 6 new involutions to the group. 

Finally there are 8/2 = 4 “vertex-to-vertex” axes, the rotations about which form 
cyclic subgroups generated by a rotation of order three. Thus each of these four
groups contributes two elements of order three.

Thus the group of rotations of the cube contains 1 identity element, 6 elements
of order four, 3 involutions stabilizing a face, 6 involutions not stabilizing a face, 
and 8 elements of order three—a total of 24 elements. 
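
The census of rotations can be confirmed by computer. The sketch below models the rotations of a cube centred at the origin, with faces perpendicular to the coordinate axes, as the 3 × 3 signed permutation matrices of determinant +1. That matrix model is an assumption introduced here for illustration and is not part of the text, and all function names are hypothetical.

```python
from collections import Counter
from itertools import permutations, product

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def perm_sign(p):
    """Sign of a permutation (tuple of images), computed from its inversion count."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def cube_rotations():
    """Signed permutation matrices with determinant sign(p) * s0 * s1 * s2 = +1."""
    return [tuple(tuple(s[i] if j == p[i] else 0 for j in range(3)) for i in range(3))
            for p in permutations(range(3)) for s in product((1, -1), repeat=3)
            if perm_sign(p) * s[0] * s[1] * s[2] == 1]

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def order(m):
    """Smallest k >= 1 with m^k equal to the identity matrix."""
    k, power = 1, m
    while power != IDENTITY:
        power = matmul(power, m)
        k += 1
    return k

rotations = cube_rotations()
counts = Counter(order(m) for m in rotations)
# Involutions about a face-to-face axis are the diagonal matrices of order two.
face_involutions = [m for m in rotations
                    if order(m) == 2 and all(m[i][i] != 0 for i in range(3))]
print(len(rotations), dict(counts), len(face_involutions))
```

In this model the 9 elements of order two split into the 3 diagonal ones (face-to-face axes) and the 6 remaining ones (edge-to-edge axes), matching the count in the text.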


3.2.3 Cosets, and Lagrange’s Theorem in Finite Groups 


Suppose H is a subgroup of a group G. If x is any element of G, we write Hx for
the product set H{x} introduced in the last subsection. Such a set Hx is a right coset
of H in G. If x and y are elements of G and H ≤ G, then y is an element of Hx if
and only if Hy = Hx. This is an easy exercise. It follows that all the elements of G
are partitioned into right cosets as


G = ⋃_{x ∈ T} Hx, a disjoint union of right cosets, for appropriate T.


Here T is merely a set consisting of one element from each right coset. Such a set 
is called a system of right coset representatives of H in G, or sometimes a (right) 
transversal of H in G. 

The set of components of this partition, that is, the set {Hx | x ∈ T} for any transversal
T, is denoted G/H. It is just the collection of all right cosets themselves. The
cardinality of this set is called the index of H in G and is denoted [G : H].

One notes that right multiplication by element x induces a bijection H → Hx.
Thus all right cosets have the same cardinality. Since they partition all the elements
of G we have the following:


Lemma 3.2.5 (Lagrange’s Theorem) 


1. If H ≤ G, then |G| = [G : H] · |H|.
2. The order of any subgroup divides the order of the group. 
3. The order of any element of G divides the order of G. 
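
The coset partition behind Lagrange's Theorem is easy to exhibit in a small case. The sketch below takes G = Sym(3), with permutations as tuples of images, and H the subgroup of order two generated by a transposition; these choices and the name right_cosets are assumptions made for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def right_cosets(H, G):
    """The set of right cosets Hx, each stored as a frozenset."""
    return {frozenset(compose(h, x) for h in H) for x in G}

G = set(permutations(range(3)))
H = {(0, 1, 2), (1, 0, 2)}
cosets = right_cosets(H, G)

print(len(cosets))                                   # the index [G : H]
print(all(len(c) == len(H) for c in cosets))         # every coset has |H| elements
print(set().union(*cosets) == G)                     # the cosets cover G
print(len(G) == len(cosets) * len(H))                # |G| = [G : H] * |H|
```

Because the cosets cover G and their sizes add up to |G|, they are pairwise disjoint, which is the partition used in the proof.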


We conclude with a useful result. 


Lemma 3.2.6 Suppose A and B are subgroups of the finite group G. Then |AB| ·
|A ∩ B| = |A| · |B|.


Proof Consider the mapping f : A × B → AB, which maps every element (a, b)
of the Cartesian product to the group product ab. This map is surjective, and the fibre
f^{−1}(ab) contains all pairs {(ax, x^{−1}b) | x ∈ A ∩ B}. (Note that in order for (ax, x^{−1}b)
to be in the designated fibre, one must have ax ∈ A and x^{−1}b ∈ B, forcing x ∈ A
and x^{−1} ∈ B, that is, x ∈ A ∩ B.) Thus |A × B| ≥ |A ∩ B| · |AB|. On the other hand,
if ab = a′b′ for (a, b) and (a′, b′) in A × B, then a^{−1}a′ = b(b′)^{−1} = x ∈ A ∩ B.
But then ax = a′, x^{−1}b = b′. So the fibres are no larger than |A ∩ B|. This gives the
inequality in the other direction.
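
A small instance makes the product formula concrete and also shows that AB need not be a subgroup. The sketch below uses two subgroups of order two in Sym(3), with permutations stored as tuples of images; the example and the name product_set are chosen here for illustration.

```python
def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def product_set(A, B):
    """The set AB of all products ab with a in A and b in B."""
    return {compose(a, b) for a in A for b in B}

e = (0, 1, 2)
A = {e, (1, 0, 2)}   # generated by the transposition swapping 0 and 1
B = {e, (2, 1, 0)}   # generated by the transposition swapping 0 and 2

AB, BA = product_set(A, B), product_set(B, A)
print(len(AB) * len(A & B) == len(A) * len(B))   # |AB| * |A ∩ B| = |A| * |B|
print(len(AB), AB == BA)
```

Here |AB| = 4 does not divide |Sym(3)| = 6, so AB cannot be a subgroup, in agreement with Corollary 3.2.4, since AB ≠ BA for this pair.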


3.3 Homomorphisms of Groups 


3.3.1 Definitions and Basic Properties 


Let G and H be groups. A mapping f : G → H is called a homomorphism of groups
if and only if
f(xy) = f(x) f(y) for all elements x, y ∈ G. (3.1)


Here, as was our convention, we have represented the group operation of both abstract 
groups G and H by juxtaposition. Of course in actual practice, the operations may 
already possess some other notation customary for familiar examples. 

For any subset X of G, we set f(X) := {f(x) | x ∈ X}. In particular, the set f(G) is
called the homomorphic image of G.

We have the usual glossary for special properties of homomorphisms. Suppose
f : G → H is a homomorphism of groups. Then


1. f is an epimorphism if f is onto—that is, f is a surjection of the underlying sets 
of group elements. Equivalently, f is an epimorphism if and only if f(G) = H. 

2. f is an embedding of groups whenever f is an injection of the underlying set of 
group elements. (It need not be surjective). 
3. f is an isomorphism of groups if and only if f is a bijection of the underlying set
of elements, that is, f is both an embedding and an epimorphism.

4. f is an endomorphism of groups if it is a homomorphism of G into itself.

5. f is an automorphism of a group G if it is an isomorphism G → G of G to itself.


The following is an elementary exercise. 
Lemma 3.3.1 Suppose f : G → H is a homomorphism of groups. Then


1. If 1_G and 1_H denote the unique identity elements of G and H, respectively, then
f(1_G) = 1_H.

2. For any element x of G, f(x^{−1}) = (f(x))^{−1}.

3. The homomorphic image f(G) is a subgroup of H. 


Lemma 3.3.2 Suppose f : G → H and g : H → K are group homomorphisms.
Then the composition g ∘ f : G → K is also a homomorphism of groups. Moreover:


1. If f and g are epimorphisms, then so is g ∘ f.

2. If f and g are both embeddings of groups, then so is g ∘ f.

3. If f and g are both isomorphisms, then so are g ∘ f and the inverse mapping
f^{−1} : H → G.

4. If f and g are both endomorphisms (i.e. G = H = K), then so is g ∘ f.

5. If f and g are both automorphisms of G then so are g ∘ f and f^{−1}.


Thus the set of automorphisms of a group G forms a group under composition of
automorphisms. (This is called the automorphism group of G and is denoted Aut(G).)

Finally we introduce an invariant associated with every homomorphism of groups.
The kernel of the group homomorphism f : G → H is the set


ker f := {x ∈ G | f(x) = 1_H}.


The beginning reader should use the subgroup criterion to verify that ker f is a
subgroup of G. If f(x) = f(y) for elements x and y in G, then xy^{−1} ∈ ker f, or
equivalently, (ker f)x = (ker f)y as cosets. Thus we see


Lemma 3.3.3 The group homomorphism f : G → H is an embedding if and only
if ker f = 1, the identity subgroup of G.
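
Lemma 3.3.3 can be watched at work on power maps. The sketch below assumes, for illustration only, the multiplicative group of nonzero residues modulo a prime p, on which x ↦ x^k is a homomorphism because the group is abelian. The names power_map, kernel and is_embedding are hypothetical.

```python
def power_map(k, p):
    """The map x -> x^k on the nonzero residues modulo a prime p."""
    return lambda x: pow(x, k, p)

def kernel(f, G):
    """Elements sent to the identity residue 1."""
    return [x for x in G if f(x) == 1]

def is_embedding(f, G):
    """Injectivity, tested by counting distinct images."""
    return len({f(x) for x in G}) == len(G)

G13 = list(range(1, 13))
square = power_map(2, 13)
print(kernel(square, G13), is_embedding(square, G13))   # kernel {1, 12}: not injective

G11 = list(range(1, 11))
cube = power_map(3, 11)
print(kernel(cube, G11), is_embedding(cube, G11))       # trivial kernel: injective
```

Squaring modulo 13 identifies x with −x, so each fibre is a coset of the kernel {1, 12}; cubing modulo 11 has trivial kernel because 3 is coprime to the group order 10.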


We shall have more to say about kernels later. 


3.3.2 Automorphisms as Right Operators 


As noted just above, homomorphisms of groups may be composed when the
arrangement of domains and codomains allows this. In that case we wrote (g ∘ f)(x) for
g(f(x)); that is, f is applied first, then g.

As remarked in Chap. 1, that notation is not very convenient if composition of
mappings is to reflect a binary operation on the set of mappings itself. We have
finally reached such a case. The automorphisms of a group G themselves form a
group Aut(G). To represent how the group operation is realized by composition of
the induced mappings, it is notationally convenient to represent them in “exponential
notation” and view the composition as a composition of right operators.


(The exponential convention) If σ is an automorphism of a group G, and g ∈ G,
we rewrite σ(g) as g^σ. This way, passing from the group operation (in Aut(G))
to composition of the automorphisms does not entail a reversal in the order of the
group arguments.⁶


Thus for automorphisms σ and τ of G and any x ∈ G we then have

x^{στ} = (x^σ)^τ.


3.3.3 Examples of Homomorphisms 


Symmetries that are induced by group elements on some object X are a great source
of examples of group homomorphisms. Where possible in these examples we write
these as left operators with ordinary composition “∘”, but we will begin to render
these things in exponential notation here and there, to get used to it. In the next
chapter on group actions, we will be using the exponential notation uniformly when
a group acts on anything.


Example 25 Examples of homomorphisms.


(a) Suppose there is a bijection between sets X and Y. Then there is an isomorphism
Sym(X) → Sym(Y). This just amounts to changing the names of the objects
being permuted.

(b) Let R* be the multiplicative group of all nonzero real numbers, and let R^+
be the multiplicative group of the positive real numbers. Then the “squaring”
mapping, which sends each element to its square, defines a homomorphism of
groups

σ : R* → R^+,

and, by restriction, an embedding

σ|_{R^+} : R^+ → R^+.

The kernel of σ is the multiplicative group consisting of the real numbers ±1.


⁶ This is part of a general scheme in which elements of some ‘abstract group’ G (with its own
multiplication) induce a group of symmetries Aut(X) of some object X so that group multiplication 
is represented by composition of the automorphisms. These are called “group actions” and are 
studied carefully in the next chapter. 
(c) In any intermediate algebra course (cf. books of Dean or Herstein, for
example), one learns that complex conjugation (which sends any complex number
z = a + bi to z̄ := a − bi, a and b real) is an automorphism of the field of
complex numbers. The norm mapping N : C* → R^+ from the multiplicative group
of non-zero complex numbers to the multiplicative group of positive real numbers
is defined by setting N(z) := z · z̄ for each complex number z. Since complex
conjugation is an automorphism of the commutative multiplicative group C*, it
follows that the norm mapping N satisfies N(z_1 z_2) = N(z_1)N(z_2) and hence is
a homomorphism of groups. The kernel of the homomorphism is the group C_1
of complex numbers of norm 1, the so-called circle group.

(When one considers that N(a + bi) = a^2 + b^2, a, b ∈ R, it is not mysterious
that the set of integers which are the sum of two perfect squares is closed under
multiplication.)
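
The remark about sums of two squares can be made completely explicit: multiplying a + bi by c + di and taking norms expresses (a^2 + b^2)(c^2 + d^2) as a sum of two squares. A minimal sketch, with a + bi stored as the pair (a, b) and hypothetical helper names:

```python
def gauss_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def norm(z):
    """N(a + bi) = a^2 + b^2."""
    a, b = z
    return a * a + b * b

z, w = (1, 2), (2, 3)            # norms 5 and 13
product = gauss_mul(z, w)
print(product, norm(product))    # (-4, 7) and 16 + 49 = 65 = 5 * 13
```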

(d) (Part 1.) Now consider the group of rigid rotations of the (regular) cube. There
are four diagonal axes intersecting the cube from a vertex to its opposite vertex.
These four axes intersect at the center of the cube, which we take to be the origin
of Euclidean 3-space. The angle α formed at the origin by the intersection of any
two of these axes satisfies cos(α) = ±1/3. Let us label these four axes 1, 2, 3, 4
in any manner. As we rotate the cube to a new congruent position, the four axes
are permuted among themselves. Thus we have a mapping


rotations of the cube → permutations of the labeled axes
which defines a group homomorphism
rigid rotations of the cube → Sym(4),


the symmetric group on the four labels of the axes. The kernel would be a group of 
rigid motions which stabilizes each of the four axes. Of course it is conceivable 
that some axes are reversed (sent end-to-end) by such a motion while others 
are fixed point-wise by the same motion. In fact if we had used the three face- 
centered axes, it would be possible to reverse two of the axes while rigidly fixing 
the third. But with these four vertex-centered axes, that is not possible. (Can you 
show why? It has to do with the angle and the rigidity of the motion.) So the 
kernel here is the identity rotation of the cube. Thus we have an embedding of the 
rotations of the cube into Sym(4). But we have seen in the previous subsection 
that both of these groups have order 24. Thus by the “pigeon-hole principle”, 
the homomorphism we have defined is an isomorphism. 
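
The isomorphism with Sym(4) can be verified by direct computation. The sketch below again models the rotations as 3 × 3 signed permutation matrices of determinant +1 acting on row vectors, and represents each vertex-to-vertex axis by a direction vector with entries ±1. Both the model and the function names are assumptions made here for illustration.

```python
from itertools import permutations, product

def perm_sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def cube_rotations():
    """Signed 3x3 permutation matrices of determinant +1 (assumed model of the rotations)."""
    return [tuple(tuple(s[i] if j == p[i] else 0 for j in range(3)) for i in range(3))
            for p in permutations(range(3)) for s in product((1, -1), repeat=3)
            if perm_sign(p) * s[0] * s[1] * s[2] == 1]

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

# One direction vector for each of the four vertex-to-vertex axes.
DIAGONALS = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1)]

def axis_permutation(m):
    """Permutation of the four diagonal axes induced by m acting on row vectors."""
    images = []
    for d in DIAGONALS:
        v = tuple(sum(d[i] * m[i][j] for i in range(3)) for j in range(3))
        if v[0] < 0:                 # v and -v span the same axis
            v = tuple(-c for c in v)
        images.append(DIAGONALS.index(v))
    return tuple(images)

rotations = cube_rotations()
images = {axis_permutation(m) for m in rotations}
print(len(rotations), len(images))   # 24 rotations, 24 distinct permutations
```

Since the 24 rotations induce 24 distinct permutations of the four axes, the kernel is trivial and the map is onto Sym(4), as the pigeon-hole argument in the text predicts.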

(Part 2.) Again G is the group of rigid rotations of the cube. There are exactly
three face-centered axes which are at right angles to one another. A 90° rotation
about one of these three axes fixes it, but transposes the other two. Thus if we
label the three face-centered axes by the letters {1, 2, 3}, and send each rotation
in the group G to the permutation of the labels of the three face-centered axes
which it induces, we obtain an epimorphism of groups G → Sym(3), or, in view
of part 1 of (d), a homomorphism Sym(4) → Sym(3). The kernel is the group
K which stabilizes each of the three face-centered axes. This group consists of
the identity element, together with three involutions, each being a 180° rotation
about one of the three face-centered axes. Multiplication in K is commutative.
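
The kernel K can be listed explicitly in the same matrix model of the rotations (signed 3 × 3 permutation matrices of determinant +1 acting on row vectors, an assumption made here and not part of the text). In that model the three face-centered axes are the coordinate axes, and a rotation carries axis i to the axis j in which row i has its nonzero entry. All function names are hypothetical.

```python
from itertools import permutations, product

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def perm_sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def cube_rotations():
    """Signed 3x3 permutation matrices of determinant +1 (assumed model of the rotations)."""
    return [tuple(tuple(s[i] if j == p[i] else 0 for j in range(3)) for i in range(3))
            for p in permutations(range(3)) for s in product((1, -1), repeat=3)
            if perm_sign(p) * s[0] * s[1] * s[2] == 1]

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def face_axis_permutation(m):
    """Axis i is carried to the axis j for which the entry m[i][j] is nonzero."""
    return tuple(next(j for j in range(3) if m[i][j] != 0) for i in range(3))

rotations = cube_rotations()
image = {face_axis_permutation(m) for m in rotations}
K = [m for m in rotations if face_axis_permutation(m) == (0, 1, 2)]

print(len(image), len(K))                                         # onto Sym(3), kernel of order 4
print(all(matmul(a, a) == IDENTITY for a in K))                   # every element squares to the identity
print(all(matmul(a, b) == matmul(b, a) for a in K for b in K))    # K is commutative
```

The kernel consists of the diagonal matrices of determinant +1, so it is a fours group in the sense introduced earlier.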

(e) Linear groups to matrix groups. Now let V be a vector space over a field F of
finite dimension n. We have seen in the previous subsection that the bijective
linear transformations from V into itself form a group which we called GL(V),
the general linear group on V. Now fix a basis A = {v_1, ..., v_n} of V. Any
linear transformation T : V → V, viewed as a right operator of V, can be
represented as a matrix

_A T_A := (p_{ij})

where

(v_i)T = p_{i1}v_1 + p_{i2}v_2 + ⋯ + p_{in}v_n

(so that the rows of the matrix depict the fate of the vector v_i under T).⁷ For
composition of the right operators S and T on V let us write

v(T ∗ S) = ((v)T)S, v ∈ V,

so that T ∗ S is simply S ∘ T in the standard notation for composition. Then we
see that

_A(T ∗ S)_A = _A(T)_A · _A(S)_A

(where chronologically, the notation intends that T is applied first, then S, being
right operators, and “·” denotes ordinary multiplication of matrices). This way the
symbolism does not transpose the order of the arguments, so in fact the mapping

T ↦ _A T_A

defines a group homomorphism

f_A : GL(V) → GL(n, F)

of the group of linear bijections into the group of n × n invertible matrices under
ordinary matrix multiplication.

⁷ Thanks to the analysts’ notation for functions, combined with our habit of reading from left to
right, many linear algebra books make linear transformations left operators of their vector spaces,
so that their matrices are then the transpose of those you see here. That is, the columns of their
matrices record the fates of their basis vectors. However as algebraists are aware, this is actually a
very awkward procedure when one wants to regard the composition of these transformations as a
binary operation on any set of such transformations.

What is the kernel? This would be the group of linear transformations which fix
each basis element, hence every linear combination of them, and hence every
vector of V. Only the identity mapping can do this, so f_A is an embedding. Since
every invertible n × n matrix arises in this way from some linear bijection of V,
we see that f_A is an isomorphism of groups.
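
The rule that the matrix of "T first, then S" is the product of the matrix of T by the matrix of S, in that order, can be checked on a small example. The sketch below works in the plane with the standard basis and two sample linear maps acting on row vectors; the maps and all names are chosen here purely for illustration.

```python
def matrix_of(T, n):
    """Matrix whose i-th row is the image of the i-th standard basis vector under T."""
    basis = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    return [list(T(v)) for v in basis]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def then(T, S):
    """The right-operator composite T * S: apply T first, then S."""
    return lambda v: S(T(v))

def T(v):
    """A shear: (x, y) -> (x + 2y, y)."""
    return (v[0] + 2 * v[1], v[1])

def S(v):
    """A quarter turn: (x, y) -> (y, -x)."""
    return (v[1], -v[0])

MT, MS = matrix_of(T, 2), matrix_of(S, 2)
print(matrix_of(then(T, S), 2) == matmul(MT, MS))   # same order on both sides
print(matrix_of(then(T, S), 2) == matmul(MS, MT))   # the reversed product differs here
```

With matrices whose columns record the images of basis vectors, the same composite would instead correspond to the product taken in the reversed order, which is the awkwardness the footnote alludes to.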

(f) The determinant homomorphism of matrix groups. The determinant associates
with each n × n matrix a scalar which is non-zero if the matrix is invertible.
That determinants preserve matrix multiplication is not very obvious from the
formula expressing the determinant as a certain sum over the elements of a
symmetric group.⁸ Taking it on faith, for the moment, this would mean that the mapping

det : GL(n, F) → F*,

taking each invertible matrix to its determinant, is a group homomorphism into
the multiplicative group of non-zero elements of the ground field F. The kernel,
then, is the group of all n × n matrices of determinant 1, which is called the
special linear group and is denoted SL(n, F).
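
For a small finite field the multiplicativity of the determinant and the size of its kernel can be verified exhaustively. The sketch below assumes n = 2 and the field of integers modulo a prime p; these choices and the function names are illustrative only.

```python
from itertools import product

def det2(m, p):
    """Determinant of a 2x2 matrix with entries taken mod p."""
    (a, b), (c, d) = m
    return (a * d - b * c) % p

def matmul2(x, y, p):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) % p for j in range(2))
                 for i in range(2))

def general_linear_group(p):
    """All invertible 2x2 matrices over the field of integers mod a prime p."""
    matrices = [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)]
    return [m for m in matrices if det2(m, p) != 0]

G = general_linear_group(3)
SL = [m for m in G if det2(m, 3) == 1]
multiplicative = all(det2(matmul2(x, y, 3), 3) == det2(x, 3) * det2(y, 3) % 3
                     for x in G for y in G)
print(len(G), len(SL), multiplicative)
```

Here |GL(2, F)| = 48 and the kernel SL(2, F) has 24 elements, so det takes each of its two non-zero values equally often.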

(g) Even and odd permutations and the sgn homomorphism. Now consider the
symmetric group on n letters. In view of Example (a) above, the symmetric groups
Sym(X) on finite sets X of size n are all isomorphic to one another, and so are
given a neutral uniform description: Sym(n) is the group of all permutations of
the set of “letters” {1, 2, ..., n}. Subgroups of Sym(n) are called permutation
groups on n letters. Representing an abstract group as such a group of
permutations provides an environment for calculating products. Many properties of finite
groups are in fact proved by such calculations. In general, the way to transport
arguments with symmetric groups to arbitrary groups G is to exploit
homomorphisms G → Sym(n). These are called “group actions” and are fully described
in the next chapter.

Now we can imagine that the neutral set of letters Ω_n := {1, 2, ..., n} is
formally a basis A of an n-dimensional vector space V over any chosen field F.
Then any permutation becomes a permutation of the basis elements of V, which
extends to a linear transformation T of V, and can then be rendered as a matrix
_A T_A with respect to the basis A as in Example (e). Thus, by a composition
of several isomorphisms that we understand, together with their restrictions to
subgroups, we have obtained an embedding of groups

Sym(n) → GL(n, F)

which represents each permutation by a matrix possessing exactly one 1 in each
row and in each column, all other entries being zero. Such a matrix is called
a permutation matrix.

⁸ The multiplicative properties follow easily from a much nicer definition of determinant which will
emerge from the exterior algebras studied in Chap. 13.

Now even the usual sum formula for the determinant shows that the determinant
of a permutation matrix is ±1. Now if we accept the thesis of Example (f) just
above, that the determinant function is in fact multiplicative, we obtain a useful
group homomorphism:

sgn : Sym(n) → {±1}

into the multiplicative group Z_2 of numbers ±1, which records the determinant
of each permutation matrix representing a permutation. The kernel of sgn is
called the alternating group, denoted Alt(n), and its elements are called even
permutations. All other permutations are called odd permutations. Since sgn is
a group homomorphism, we see that


An even permutation times an even permutation is even. 
An odd permutation times an odd permutation is even. 
An even permutation times an odd permutation (in any order) is odd. 


Since the argument developed for this example assumed the thesis of part (f) 
(of this same Example 25)—that the determinant of a product of matrices is the 
product of their determinants—and since that thesis may not be known from first 
principles by some students, we shall give an elementary proof of the existence 
of the sgn homomorphism in Sect. 4.2.2 of the next chapter, without any appeal 
to determinants. 
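
Pending that proof, the determinant description of sgn can at least be tested for small n. The sketch below builds permutation matrices for permutations of {0, ..., n−1} (stored as tuples of images), computes determinants by cofactor expansion along the first row, and checks multiplicativity on Sym(4). The encoding and function names are assumptions made for illustration.

```python
from itertools import permutations

def perm_matrix(p):
    """Row i has its single 1 in column p[i] (rows record the images of basis vectors)."""
    n = len(p)
    return [[1 if j == p[i] else 0 for j in range(n)] for i in range(n)]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def sgn(p):
    return det(perm_matrix(p))

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

G = list(permutations(range(4)))
sign = {p: sgn(p) for p in G}
print(set(sign.values()) == {1, -1})
print(all(sign[compose(p, q)] == sign[p] * sign[q] for p in G for q in G))
print(sum(1 for p in G if sign[p] == 1))   # order of Alt(4)
```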

(h) The automorphism group of a cyclic group of order n. Finally, perhaps, we should
consider an example of an automorphism group of a group. We consider here
the group Aut(Z_n), the group of automorphisms of the cyclic group of order n,
where n is any natural number. Suppose, then, that C is the additive group of
integers mod n, that is, the additive group of residue classes modulo n. Thus
C consists of the classes [j] := j + nZ, j = 1, 2, ..., n − 1, n. The addition rule becomes

[i] + [j] = [i + j] or [i + j − n], whichever does not exceed n,

where 1 ≤ i ≤ j ≤ n. Then element [1] generates this group. Indeed so does
[m] if and only if gcd(m, n) = 1. Thus, if f : C → C is an automorphism
of C it follows that f([1]) = [m] where gcd(m, n) = 1. Moreover, since f
is a homomorphism, f([k]) = [mk]. Thus the automorphism f is completely
determined by the natural number m coprime to and less than n. The number of
such numbers defines the Euler φ-function, whose value at n is denoted φ(n).
Thus φ(n) = |Aut(Z_n)|.


It now follows that Aut(C) is isomorphic to the multiplicative group Φ(n) of all
residues mod n which consist only of numbers which are relatively prime to n.⁹
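
The description of Aut(Z_n) can be confirmed by brute force. Every endomorphism of the additive group of residues mod n is multiplication by some m, so the sketch below simply lists those multipliers m for which k ↦ mk is bijective and compares their number with a direct count of residues coprime to n. The function names are hypothetical.

```python
from math import gcd

def automorphisms_of_Zn(n):
    """Multipliers m in 1..n for which k -> m*k (mod n) is a bijection of the residues."""
    result = []
    for m in range(1, n + 1):
        images = sorted((m * k) % n for k in range(n))
        if images == list(range(n)):
            result.append(m)
    return result

def euler_phi(n):
    """Number of m in 1..n with gcd(m, n) = 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

print(automorphisms_of_Zn(12))                    # the multipliers coprime to 12
print(all(len(automorphisms_of_Zn(n)) == euler_phi(n) for n in range(1, 31)))
```

Composing the automorphisms given by m and m′ gives the one for mm′ mod n, which is why the group is the multiplicative group of residues prime to n.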


⁹ We will obtain a more exact structure of this group when we encounter the Sylow theorems.
(i) The inner automorphism group. Let G be a general abstract group and let x be
a fixed element of G. We define a mapping

ψ_x : G → G,

called conjugation by x, by the rule ψ_x(g) := x^{−1}gx, for all g ∈ G. One can
easily verify that ψ_x(gh) = ψ_x(g)ψ_x(h), and as ψ_x is a bijection, it is an
automorphism of G. Any automorphism of G which is of the form ψ_x for some x in
G, is called an inner automorphism of G.

Now if {x, y, g} ⊆ G, one always has

y^{−1}(x^{−1}gx)y = (y^{−1}x^{−1})g(xy) = (xy)^{−1}g(xy)    (3.2)

which means

ψ_y ∘ ψ_x = ψ_{xy}    (3.3)

for all x, y. Thus the set of inner automorphisms is closed under composition of
morphisms. Setting y = x^{−1} in Eq. (3.3) we have

ψ_{x^{−1}} = (ψ_x)^{−1}    (3.4)

and so the set of inner automorphisms is also closed under taking inverses. It
now follows that the set of inner automorphisms of a group G forms a subgroup
of the full automorphism group Aut(G). We call this subgroup the inner
automorphism group of G, denoted by Inn(G).

Now Eq. (3.3) would suggest that there is a homomorphism from G to Aut(G)
except for one thing: the arguments of the ψ-morphisms come out in the wrong
order in the right side of the equation. That is because the operation “∘” is
denoting composition of left operators.

This reveals the efficacy of using the exponential notation for denoting
automorphisms as right operators. We employ the following:


(Convention of writing conjugates in groups) If a and b are elements of a
group G, we write

a^{−1}ba in the exponential form b^a.

In this notation Eq. (3.2) reads as follows:

(g^x)^y = g^{xy}    (3.5)

for all {g, x, y} ⊆ G. What could be simpler?
Then we understand Eq. (3.3) to read

ψ_{xy} = ψ_x ψ_y,    (3.6)

where the juxtaposition on the right hand side of the equation indicates
composition of right operators; that is, the chronological order in which the mappings
are performed reads from left to right, ψ_x first and then ψ_y.

Now Eq. (3.6) provides us with a group homomorphism:

ψ : G → Aut(G)

taking element x to the inner automorphism ψ_x, the automorphisms of Aut(G)
being composed as right operators (as in the exponential convention for
isomorphisms on p. 81).

What is the kernel of the homomorphism ψ? This would be the set of all elements
z ∈ G such that ψ_z = 1_G, the identity map on G. Thus this is the set Z(G) of
elements z of G satisfying any one of the following equivalent conditions:


(a) ψ_z = 1_G, the identity map on G,
(b) z^{−1}gz = g for all elements g of G,
(c) zg = gz for all elements g of G.


The subgroup Z(G) is called the center of G. The identity element is always
in the center. If Z(G) = G, then multiplication in G is “commutative”, that is,
xy = yx for all (x, y) ∈ G × G. Such a group is said to be commutative, and is
affixed with the adjective abelian. Thus G is abelian if and only if G = Z(G).
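
Conjugation, the center, and the inner automorphisms can all be computed directly in a small group. The sketch below uses Sym(3) and its cyclic subgroup of order three, with permutations stored as tuples of images and products read from left to right (first factor applied first). These conventions and the function names are assumptions made for illustration.

```python
from itertools import permutations

def compose(p, q):
    """The product pq as right operators: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(g, x):
    """The conjugate g^x = x^{-1} g x."""
    return compose(compose(inverse(x), g), x)

def center(G):
    """Elements commuting with every element of G."""
    return {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}

def inner_automorphisms(G):
    """Each conjugation map recorded as the tuple of its values on a fixed listing of G."""
    elements = sorted(G)
    return {tuple(conjugate(g, x) for g in elements) for x in G}

G = set(permutations(range(3)))
C = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # cyclic subgroup of order three

# The exponential law (g^x)^y = g^{xy}.
print(all(conjugate(conjugate(g, x), y) == conjugate(g, compose(x, y))
          for g in G for x in G for y in G))
print(center(G), len(inner_automorphisms(G)))         # trivial center, six inner automorphisms
print(center(C) == C, len(inner_automorphisms(C)))    # abelian: only the identity automorphism
```

In both cases the number of distinct inner automorphisms equals |G| divided by |Z(G)|, as one expects from a homomorphism with kernel Z(G).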


A Glossary of Terms Expected to be Understood from the Examples 


1. Homomorphism of groups.
2. Epimorphism of groups.
3. Embedding of groups.
4. Isomorphism of groups.
5. Endomorphism of a group.
6. Automorphism of a group.
7. The kernel of a homomorphism, ker f.
8. The automorphism group, Aut(G), of a group G.
9. The inner automorphism group, Inn(G), of a group G.
10. An inner automorphism.
11. The center of a group, Z(G).
12. Abelian groups.
3.4 Factor Groups and the Fundamental Theorems
of Homomorphisms 


3.4.1 Introduction 


A teacher is sometimes obliged to present to a class a Theorem labeled as some 
sort of “Fundamental Theorem”. More often than not, such a theorem is not quite as 
fundamental as it must have seemed at an earlier time in our history.¹⁰

At a minimum it would seem that a proposition should be labelled “a fundamental 
theorem” if it has these properties: 


1. It should be used so constantly in the daily life of a scholar of the field, that 
quoting it becomes repetitive. 

2. Its logical distance from the “first principles” of the field should be short enough
to bear a short explanation to a puzzled student (that is, the alleged “fundamental
theorem” should be “teachable”).


We are lucky today! The fundamental theorems of homomorphisms of groups
actually meet both of these criteria. They tell us that the homomorphic images of
groups, their compositions, and their effects on subgroups, can all be derived from
an internal study of the groups themselves.

The custom has been to name these three theorems as the “First”, “Second”, and
“Third Fundamental Theorems of Homomorphisms”. However a perusal of eight
well-known textbooks in algebra shows this nomenclature to be far from uniform.¹¹
So we have tried to sidestep the ambiguity by naming these three very basic Theorems
in a way related to what these theorems are telling us.


3.4.2 Normal Subgroups 


The subgroups of a group are permuted among themselves by the automorphisms
of the group. Explicitly: if σ ∈ Aut(G), and K is a subgroup of G, then
K^σ := {k^σ | k ∈ K} is again a subgroup of G.¹² Moreover, if H is a subgroup of
Aut(G), then any subgroup of G left invariant by the elements of H is said to be


¹⁰ Who on earth decided that the Fundamental Theorem of Algebra should be the fact that the
Complex Numbers form an algebraically closed field? Who on earth decided that the Fundamental 
Theorem of Geometry should be the fact that an isomorphism between two Desarguesian Projective 
Spaces of sufficient dimension is always induced by a semilinear transformation of the underlying 
vector spaces? 

¹¹ One author’s “First” theorem is another’s “Second”, all three ordinal labels are used by one author
(Hall) for what another calls the “First” Theorem, and some authors (Michael Artin, for example), 
seeing the problem, wisely declined to assign ordinal numbers beyond the “First”. 

¹² Note the convention of regarding automorphisms as right operators whose action is denoted
exponentially (see p. 81).

H-invariant. Naturally we have special cases for special choices of H. A subgroup K
which is invariant under the full group Aut(G) is said to be a characteristic subgroup
of G. We write this as K char G. (The center, which we previously encountered, is
certainly one of these.)

But let us drop down to a larger class of H-invariant subgroups by letting
H descend from the full automorphism group of G to the inner automorphism
group, Inn(G), encountered in the previous section (Example 25, part (i)).
A subgroup which is invariant under Inn(G) is said to be
normal in G, or to be a normal subgroup of G.

Just putting definitions together one has 


Lemma 3.4.1 The following are equivalent conditions for a subgroup K of G: 


1. ψ_x(K) = K, for all x ∈ G,
2. x⁻¹Kx = K, for every x ∈ G,
3. xK = Kx, for each x ∈ G.


(In this Lemma, ψ_x is the inner automorphism induced by the element x: see
Example 25, part (i) preceding.) 
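
The equivalent conditions of Lemma 3.4.1 can be tested mechanically in a small group. The sketch below is an illustration only; it assumes permutations are tuples composed as right operators, and the function names (mul, inv, conjugate, is_normal, cosets_agree) are hypothetical helpers introduced here.

```python
from itertools import permutations

def mul(p, q):
    """Product pq: apply p first, then q (right-operator convention)."""
    return tuple(q[i] for i in p)

def inv(p):
    """Inverse permutation of p."""
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def conjugate(K, x):
    """Return the set x^{-1} K x."""
    xi = inv(x)
    return {mul(mul(xi, k), x) for k in K}

def is_normal(K, G):
    """Condition 2 of the Lemma: x^{-1} K x = K for every x in G."""
    return all(conjugate(K, x) == K for x in G)

def cosets_agree(K, G):
    """Condition 3 of the Lemma: xK = Kx for each x in G."""
    return all({mul(x, k) for k in K} == {mul(k, x) for k in K} for x in G)

G = set(permutations(range(3)))          # Sym(3)
A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # the rotations, of order 3
T = {(0, 1, 2), (1, 0, 2)}               # generated by one transposition
```

In this example both tests accept the subgroup of order 3 and both reject the subgroup generated by a transposition, as the Lemma predicts.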

As a notational convenience, the symbol K ⊴ G will always stand for the assertion:
“K is a normal subgroup of G” or, equivalently, “K is normal in G”. This is always
a relation between a subgroup K and some group G which contains it. It is not
a transitive relation. It is quite possible for a group G to possess subgroups L and
K, with L ⊴ K and K ⊴ G, for which L is not normal in G.¹³ Some immediate
consequences of normality are the following:


Corollary 3.4.2 Suppose N is a normal subgroup of the group G and suppose H is
any subgroup of G. Then the following statements hold:


1. N is normal in any subgroup of G which contains it.

2. N ∩ H is a normal subgroup of H. (As a special case, if H contains N, then N
is normal in H.)

3. NH = HN is a subgroup of G.

4. If H is also normal in G, then so is HN.


The proof is left for the beginning student in Exercise (4) in Sect. 3.7.1 at the end 
of this chapter. 
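
Both the failure of transitivity of normality and parts 2 and 3 of Corollary 3.4.2 can be observed in the alternating group on four letters. The following sketch is illustrative only; the tuple representation of permutations and the helper names are assumptions made for this example.

```python
from itertools import permutations

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def is_even(p):
    """True when p has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def is_normal(K, G):
    """True when x^{-1} K x = K for every x in G."""
    return all({mul(mul(inv(x), k), x) for k in K} == K for x in G)

e = (0, 1, 2, 3)
A4 = {p for p in permutations(range(4)) if is_even(p)}   # Alt(4), order 12
K = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}        # double transpositions
L = {e, (1, 0, 3, 2)}                                    # a subgroup of order 2 in K

# Corollary 3.4.2 with N = K and H a subgroup of order 3 in Alt(4).
H = {e, (1, 2, 0, 3), (2, 0, 1, 3)}
NH = {mul(n, h) for n in K for h in H}
HN = {mul(h, n) for h in H for n in K}
```

Here L is normal in K and K is normal in Alt(4), yet L is not normal in Alt(4); moreover NH and HN coincide (in this case with all of Alt(4)).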

One should not leave a basic section on normal subgroups without touching on the 
relationship between the normal subgroups and characteristic subgroups. As noted 
above, a normal subgroup of G is simply a subgroup N of G which is invariant under 
all the inner automorphisms of G, while a characteristic subgroup is a subgroup 


¹³ In the group G of rigid rotations of a cube (Example 23, part (g) of Sect. 3.2), the 180° rotation
about one of the three face-centered axes generates a subgroup L which point-wise fixes its axis
of rotation, but inverts the other two face-centered axes. This is a normal subgroup of the abelian
subgroup K of all rotations of the cube which leave each of the three face-centered axes invariant.
Then K is normal in G, being the kernel of the homomorphism of Example 25, Part (d) of Sect. 3.3.
But clearly L is not normal in G, since otherwise its unique point-wise fixed face-centered axis
would also be fixed. But it is not fixed as G/K induces all permutations of Sym(3) on these axes.

K of G that is invariant under all automorphisms of G. Thus every characteristic
subgroup of G is already normal, but, in general, being characteristic is a much
stronger condition.


Theorem 3.4.3 Assume N is a normal subgroup of G. If K is a characteristic 
subgroup of N (i.e. K is invariant under Aut(N)), then K is normal in G. 


One more result:

Theorem 3.4.4 For any group G, Inn(G) ⊴ Aut(G).


The proofs of these two theorems are left as Exercises 3 and 4 of Sect. 3.7.3 at the 
end of this chapter. 

Almost any subgroup of G that is unique in some respect is a characteristic 
subgroup of G. For example the identity group 1, the whole group G and the center, 
Z(G), are all characteristic subgroups of G. 


3.4.3 Factor Groups 


Let N be a normal subgroup of G. By Part 3 of Lemma 3.4.1 above, one sees
that the subset xN is in fact the set Nx, for each x ∈ G. It follows that
Nx · Ny = N(xN)y = N(Nx)y = NNxy = Nxy as subsets of G. Thus there is actually a
multiplicative group G/N whose elements are the right cosets of N in G, where
multiplication is unambiguously given by the rule


(Nx) - (Ny) = Nxy. (3.7) 


We have a name for this multiplicative group of cosets of a normal subgroup N. It is
called the factor group, G/N. Its identity element is the subgroup N itself (certainly
a right coset N · n, for any n ∈ N) since Nx · N = NNx = Nx. The inverse in G/N of
the element Nx is the element Nx⁻¹. In fact it is now easy to verify that the mapping
G → G/N that sends element x to coset Nx is a group homomorphism (see Theorem
3.4.5, part (i) below).
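
The rule (Nx) · (Ny) = Nxy can be checked directly as an equality of subsets. The sketch below does this for G = Sym(4) and N the normal subgroup of order four consisting of the identity and the double transpositions; the representation of permutations as tuples and the helper names are assumptions made for illustration.

```python
from itertools import permutations

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

G = list(permutations(range(4)))   # Sym(4)
N = frozenset({(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)})

def right_coset(x):
    """The right coset Nx."""
    return frozenset(mul(n, x) for n in N)

def set_product(A, B):
    """All products ab with a in A and b in B."""
    return frozenset(mul(a, b) for a in A for b in B)

cosets = {right_coset(x) for x in G}

# The product of the subsets Nx and Ny is exactly the coset Nxy.
rule_holds = all(
    set_product(right_coset(x), right_coset(y)) == right_coset(mul(x, y))
    for x in G for y in G
)

# N itself acts as the identity element of G/N.
identity_ok = all(set_product(C, N) == C == set_product(N, C) for C in cosets)
```

With these choices G/N has six elements, and the subset product of two cosets is again a single coset, which is what makes the multiplication unambiguous.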

Now let f : G → H be a homomorphism of groups. Two special groups associated
with a group homomorphism f have been introduced earlier in this chapter:
the range or image, f(G), and the kernel, ker f. Now if y ∈ ker f, and x ∈ G, we
see that


\[ f(x^{-1}yx) = (f(x))^{-1} f(y) f(x) = (f(x))^{-1} \cdot 1 \cdot f(x) = 1 \in H, \]


since f(y) = 1. Thus for all y ∈ ker f and x ∈ G, x⁻¹yx ∈ ker f, so ker f is always
a normal subgroup. What about the corresponding factor group?

Theorem 3.4.5 (The Fundamental Theorem of Homomorphisms)


(i) If N is a normal subgroup of a group G, the mapping G → G/N which maps
each element x of G to the right coset Nx which contains it, is an epimorphism
of groups. Its kernel is N.

(ii) If f : G → H is a homomorphism of groups, there is an isomorphism
η : f(G) → G/(ker f) taking each element f(x), x ∈ G, to (ker f)x. Thus every
homomorphic image f(G) is isomorphic to a factor group of G.

(iii) In particular, the kernel of the homomorphism is trivial (i.e. ker f = 1) if and
only if the homomorphism itself is injective.


Proof (i) That the mapping π : G → G/N defined by π(x) := Nx is a homomorphism
follows from NxNy = Nxy, for all (x, y) ∈ G × G. By definition of G/N
this map is onto. (The epimorphism π : G → G/N is usually called the projection
homomorphism or sometimes the natural homomorphism onto the factor group
G/N.) Since the coset N = N · 1 is the identity element of G/N, the kernel of π is
precisely the subset {x ∈ G | Nx = N} = {x ∈ G | x ∈ N} = N. Thus ker π = N.

(ii) Now set N := ker f. We propose a mapping η : f(G) → G/N which takes
an image element f(x) to the coset Nx. Since this recipe is formulated in terms
of a single element x, we must show that the proposed mapping is well-defined.
Suppose f(x) = f(y). We wish to show Nx = Ny. But the hypothesis shows that
f(x⁻¹y) = f(x⁻¹)f(y) = (f(x))⁻¹f(x) = 1 ∈ H, so x⁻¹y ∈ ker f = N,
whence y ∈ xN = Nx and so Nx = Ny, as desired.

Now


\begin{align}
\eta(f(x)f(y)) &= \eta(f(xy)) = Nxy \tag{3.8} \\
               &= NxNy = \eta(f(x)) \cdot \eta(f(y)). \tag{3.9}
\end{align}


Thus η is a homomorphism. If f(x) were in the kernel of η, then η(f(x)) = Nx = N,
the identity element of G/N. Thus x ∈ N, so f(x) = f(1), the identity element of
f(G). Finally, η is onto, since, for any g ∈ G, the coset Ng = η(f(g)). Thus η is a
bijective homomorphism and so is an isomorphism.

(iii) This is obvious from first principles since xy⁻¹ ∈ ker f if and only if f(x) =
f(y). It also follows from (ii). The proof is complete.
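
The construction of η in part (ii) can be followed step by step in a small additive example. The sketch below is an illustration under stated assumptions: the group is the integers mod 12 written additively, the homomorphism x → 3x is chosen arbitrarily for the example, and cosets are written N + x rather than Nx.

```python
n = 12
G = range(n)

def f(x):
    """The homomorphism x -> 3x on the integers mod 12 (written additively)."""
    return (3 * x) % n

kernel = frozenset(x for x in G if f(x) == 0)

def coset(x):
    """The coset N + x of the kernel N."""
    return frozenset((k + x) % n for k in kernel)

# Build eta : f(G) -> G/N, f(x) -> N + x, checking that the recipe does not
# depend on the choice of x representing a given image element.
eta = {}
well_defined = True
for x in G:
    if eta.setdefault(f(x), coset(x)) != coset(x):
        well_defined = False

image = set(eta)
injective = len(set(eta.values())) == len(eta)
surjective = set(eta.values()) == {coset(x) for x in G}
homomorphism = all(
    eta[(u + v) % n] == frozenset((a + b) % n for a in eta[u] for b in eta[v])
    for u in image for v in image
)
```

In this example the kernel is {0, 4, 8}, the image is {0, 3, 6, 9}, and η matches the four image elements with the four cosets of the kernel.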


Remark (a) The main idea about (i) is that when you spot a homomorphism shooting 
off somewhere, you do not have to search all over the Universe to study it. Instead, 
you can realize the homomorphic image right inside the structure of the group G 
itself, as one of its factor groups. 

(b) We have included statement (iii) in the Fundamental Theorem of Homomorphisms
since it is so often implicitly used without any particular reference. Since it
is an immediate consequence of part (ii) we have given it a home here.


There are two further consequences of Theorem 3.4.5 which are contained in the
following Corollary.

Corollary 3.4.6 (The classical isomorphism theorems)


(i) (The Composition Theorem for Groups) Suppose K and N are both normal
subgroups of G, with K contained in N. Then N/K is a normal subgroup of
G/K, and G/N is isomorphic to the factor group (G/K)/(N/K).

(ii) (The Modularity Theorem for Groups) If N is a normal subgroup of G, and H
is any subgroup of G, then HN/N is isomorphic to H/(H ∩ N).


Proof (i) By Corollary 3.4.2, part 1, K is normal in N, so N/K is the multiplicative
group of cosets {Kn | n ∈ N}. For any coset Kx of G/K, and coset Kn of N/K,
(Kx)⁻¹KnKx = K(x⁻¹nx), which is in N/K. Thus N/K ⊴ G/K. Thus there is a
natural epimorphism f₂ : G/K → (G/K)/(N/K) onto the factor group as described
in the Fundamental Theorem (Theorem 3.4.5, part (i)). By the same token, there
is a canonical epimorphism f₁ : G → G/K. By Lemma 3.3.2 of Sect. 3.3, the
composition of these epimorphisms is again an epimorphism:


f₂ ∘ f₁ : G → (G/K)/(N/K).


An element x of G is first mapped to the coset Kx and then to the coset (Kx){Kn | n ∈
N} (which, viewed as a set of elements of G, is just Nx). Thus x maps to the identity
element N/K of (G/K)/(N/K) if and only if x ∈ N. Thus the kernel of the
epimorphism f₂ ∘ f₁ is N. The result now follows from the Fundamental Theorem
of Homomorphisms (Theorem 3.4.5, part (ii)).

(ii) Clearly N ⊴ HN and N ∩ H ⊴ H by Corollary 3.4.2, parts 1 and 2. We propose
to define a morphism f : H → HN/N by sending element h of H to coset Nh ∈
HN/N. Clearly hh′ is sent to Nhh′ = (Nh)(Nh′), so f is a group homomorphism.
Since every coset of N in HN = NH has the form Nh for some h ∈ H, f is an
epimorphism of groups. Now if h ∈ H, then Nh = N, the identity of HN/N, if and
only if h ∈ H ∩ N. Thus ker f = H ∩ N and now the conclusion follows from the
Fundamental Theorem of Homomorphisms (Theorem 3.4.5, part (ii)).
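
The correspondence behind the Modularity Theorem, sending the coset (H ∩ N)h to the coset Nh, can be tabulated explicitly. The sketch below is an illustration only: it takes N to be the normal subgroup of order four in Sym(4) and H the cyclic subgroup generated by a 4-cycle, with permutations stored as tuples; all names are chosen for this example.

```python
def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

e = (0, 1, 2, 3)
N = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}    # normal in Sym(4)
c = (1, 2, 3, 0)                                      # a 4-cycle
H = {e, c, mul(c, c), mul(mul(c, c), c)}              # cyclic of order 4

HN = {mul(h, n) for h in H for n in N}
H_cap_N = H & N

# Tabulate (H ∩ N)h -> Nh and check that it does not depend on the choice of h.
eta = {}
well_defined = True
for h in H:
    source = frozenset(mul(k, h) for k in H_cap_N)
    target = frozenset(mul(n, h) for n in N)
    if eta.setdefault(source, target) != target:
        well_defined = False

cosets_of_N_in_HN = {frozenset(mul(n, g) for n in N) for g in HN}
bijective = (len(set(eta.values())) == len(eta)
             and set(eta.values()) == cosets_of_N_in_HN)
```

In this instance HN has order 8, H ∩ N has order 2, and both H/(H ∩ N) and HN/N have exactly two elements matched by the map.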


3.4.4 Normalizers and Centralizers 


Let X be some non-empty subset of a group G. For each x ∈ G, the set ψ_x(X) :=
x⁻¹Xx is called a conjugate (in G) of the set X. Given X, consider the set


N_G(X) := {x ∈ G | x⁻¹Xx = X}.


The reader may verify that the set on the right side of the displayed equation indeed
satisfies any one of the equivalent conditions listed in the Subgroup Criterion (Lemma
3.2.3). The subgroup described by this equation is called the normalizer (in G) of
the set X and is denoted N_G(X). Also, if H is any subgroup of G, we say that H
normalizes the set X if and only if H is a subgroup of the normalizer N_G(X).

In practice, X is often itself a subgroup.

Lemma 3.4.7 The following statements about a group G are true.


1. The subgroup H is a normal subgroup of G, i.e. H ⊴ G, if and only if N_G(H) = G.
More generally, H is normal in the subgroup K, i.e. H ⊴ K, if and only if
H ≤ K ≤ N_G(H).

2. If H and K are subgroups of G, then N_K(H) = N_G(H) ∩ K.

3. If H and K are subgroups of G, and K normalizes H, then HK = KH, and KH
is a subgroup of G.

4. Suppose the subgroup K normalizes the subset X of G. Then K also normalizes
the subgroup ⟨X⟩_G generated by X.


The statements are immediate consequences of the definitions. The beginning 
student is urged to warm up some nice fall afternoon by devising formal proofs of 
these statements. 

There is another kind of subgroup determined by a subset X of a group G, namely 
the subgroup 

C_G(X) := {g ∈ G | g⁻¹xg = x, for all x ∈ X},


called the centralizer (in G) of X. It consists precisely of those elements in G which
commute with every element of X. We say that subgroup H centralizes the set X if
and only if H ⊆ C_G(X).

At this point it might be useful to compare the centralizer and the normalizer. The
normalizer N_G(X) is the set of elements g ∈ G whose associated inner automorphism
ψ_g leaves the subset X invariant as a whole. The centralizer C_G(X) consists of those
elements g ∈ G whose associated inner automorphism ψ_g fixes the set X element-wise.
Now, as with the normalizer, we possess a number of elementary observations.


Lemma 3.4.8 Suppose G is some fixed group with a designated subset X. The 
following statements hold. 


1. We have C_G(X) ⊴ N_G(X).

2. If H is a subgroup of G, then C_H(X) = C_G(X) ∩ H.

3. If a subgroup H centralizes X, then it also centralizes the subgroup ⟨X⟩_G generated
by X.


Once again the beginning student is invited to spend a few moments some nice
fall afternoon assembling formal proofs of the statements in Lemma 3.4.8. This time,
in view of part 1, the student is permitted to order a drink.
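
The difference between normalizing and centralizing is easy to see by direct computation. The following sketch, offered only as an illustration, takes G = Sym(4) and X the cyclic subgroup generated by a 4-cycle; the tuple representation and the function names normalizer and centralizer are assumptions made for this example.

```python
from itertools import permutations

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def normalizer(G, X):
    """Elements g of G with g^{-1} X g = X (X is preserved as a set)."""
    return {g for g in G if {mul(mul(inv(g), x), g) for x in X} == X}

def centralizer(G, X):
    """Elements g of G with g^{-1} x g = x for every x in X."""
    return {g for g in G if all(mul(mul(inv(g), x), g) == x for x in X)}

G = list(permutations(range(4)))                       # Sym(4)
c = (1, 2, 3, 0)                                       # a 4-cycle
X = {(0, 1, 2, 3), c, mul(c, c), mul(mul(c, c), c)}    # the subgroup it generates

N_X = normalizer(G, X)
C_X = centralizer(G, X)

# Part 1 of Lemma 3.4.8: the centralizer is normal in the normalizer.
C_normal_in_N = all(
    {mul(mul(inv(g), z), g) for z in C_X} == C_X for g in N_X
)
```

In this example the normalizer has order 8, while the centralizer is just X itself, of order 4.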


3.5 Direct Products 


A direct product is a formal construction for getting new groups in a rather easy way. 
First let 𝒢 = {G_σ | σ ∈ I} be a family of groups indexed by the index set I. We can
write elements of the Cartesian product of the G_σ as functions

\[ f : I \to \bigcup_{\sigma \in I} G_\sigma \quad \text{such that } f(\sigma) \in G_\sigma. \]


Of course, when I is countable, we can represent elements f in the usual way, as
sequences (f(1), f(2), ...). Define a binary operation on the Cartesian product by
declaring the “product” f₁f₂ of f₁ and f₂ to be the function whose value (f₁f₂)(σ) at
σ ∈ I is f₁(σ)f₂(σ); that is, the values f_i(σ) at the “coordinate” σ are multiplied in
the group G_σ to yield the σ-coordinate of the “product”. This is termed
“coordinate-wise multiplication”, since for sequences, the product of (a₁, a₂, ...) and
(b₁, b₂, ...) is (a₁b₁, a₂b₂, ...), by this definition. The Cartesian product with this
coordinate-wise multiplication clearly forms a group, which we call the direct product
over 𝒢 and is denoted by


\[ \prod_{\sigma \in I} G_\sigma \quad \text{or} \quad \prod_{I} G_\sigma, \]


or, when I = {1, 2, ..., n}, simply by

\[ G_1 \times G_2 \times \cdots \times G_n. \]


It contains a subgroup consisting of all maps f in the definition of direct product
for which f(σ) fails to be the identity element of the group G_σ only finitely many
times. This subgroup is called the weak direct product or direct sum over 𝒢 and is
denoted


\[ \bigoplus_{\sigma \in I} G_\sigma \quad \text{or} \quad \bigoplus_{I} G_\sigma. \]


When I is finite, there is no distinction between the direct product and direct sum.

Consider the group G = {±1} under ordinary multiplication of integers. Thus
G is a group of order 2, with involution −1. Then G × G is the group of pairs
(u, v), u, v ∈ {±1}, with coordinate-wise multiplication. For example, one calculates
that (−1, 1) · (1, −1) = (−1, −1). In this case G × G is a group of order four with an
identity element and three involutions. The product of any two distinct involutions
is the third involution. Any group in the isomorphism class of this group is called a
fours group.
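
The fours group just described is small enough to enumerate completely. The sketch below is an illustration; the names GxG, mul and involutions are chosen for this example only.

```python
from itertools import product

G = (1, -1)                          # the group {±1} under multiplication
GxG = list(product(G, repeat=2))     # the four pairs (u, v)

def mul(a, b):
    """Coordinate-wise multiplication in G x G."""
    return (a[0] * b[0], a[1] * b[1])

identity = (1, 1)
involutions = [a for a in GxG if a != identity and mul(a, a) == identity]

# The product of two distinct involutions is the remaining involution.
third_rule = all(
    mul(a, b) in involutions and mul(a, b) not in (a, b)
    for a in involutions for b in involutions if a != b
)

example = mul((-1, 1), (1, -1))      # the calculation quoted in the text
```

The enumeration confirms the three involutions and the rule that any two distinct involutions multiply to the third.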

For every index τ in I, there is clearly a homomorphism π_τ : ∏_{σ∈I} G_σ → G_τ
which takes f to f(τ), for all f in the direct product. Such an epimorphism is called
a projection onto the coordinate indexed by τ. This epimorphism retains this name
even when it is restricted to the direct sum.

Any permutation of the index set induces an obvious isomorphism of direct products
and direct sums. Thus G₁ × G₂ is isomorphic to G₂ × G₁ even though they are
not formally the same group.

Similarly, any (legitimate) rearrangement of parentheses involved in constructing
direct products yields isomorphisms; specifically


\[ G_1 \times G_2 \times G_3 \cong (G_1 \times G_2) \times G_3 \cong G_1 \times (G_2 \times G_3). \]

Now suppose A and B are two subgroups of a group G with A ∩ B = 1, and B
normalizing A (that is, B ≤ N_G(A)). Then together they generate a subgroup AB
with order |A| · |B| (Lemma 3.2.6). Now suppose in addition that A normalizes B.
Then for any element a ∈ A and element b ∈ B, the factorizations (aba⁻¹)b⁻¹ =
a(ba⁻¹b⁻¹) show that aba⁻¹b⁻¹ ∈ A ∩ B = {1} and so ab = ba. Thus all elements
of A commute with all elements of B. In that case the mapping A × B → AB
which takes (a, b) ∈ A × B to ab is a group homomorphism whose kernel is
{(x, x⁻¹) | x ∈ A ∩ B}. Since A ∩ B = {1}, this map is an isomorphism.

These remarks are summarized in the next Lemma. 


Lemma 3.5.1 


(i) For any permutation π in Sym(n), there is an isomorphism

\[ G_1 \times G_2 \times \cdots \times G_n \to G_{\pi(1)} \times G_{\pi(2)} \times \cdots \times G_{\pi(n)}, \]

although the two groups are not necessarily formally the same.

(ii) Any two well-formed groupings of the factors of a direct product into
parentheses yield groups which are isomorphic to the original direct product and
hence are isomorphic to each other.

(iii) (Internal Direct Products) Suppose A₁, A₂, ... is a countable sequence of
subgroups of a group G.


(a) Suppose A_i normalizes A_j whenever i ≠ j, and
(b) A₁A₂···A_{k−1} ∩ A_k = 1, for all k ≥ 2.


Then A₁A₂···A_n is isomorphic to the direct product A₁ × ··· × A_n for any
finite initial segment {A₁, ..., A_n} of the sequence of subgroups.
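
The internal direct product of two subgroups can be verified concretely. In the following sketch (an illustration, with the tuple representation of permutations and all names assumed for the example), A is a copy of Sym(3) acting on {0, 1, 2} inside Sym(5) and B is generated by the transposition of 3 and 4.

```python
from itertools import permutations, product

def mul(p, q):
    """Product pq: apply p first, then q."""
    return tuple(q[i] for i in p)

# A permutes {0, 1, 2} and fixes 3 and 4; B swaps 3 and 4 and fixes the rest.
A = {p + (3, 4) for p in permutations(range(3))}
B = {(0, 1, 2, 3, 4), (0, 1, 2, 4, 3)}

AB = {mul(a, b) for a in A for b in B}
commute = all(mul(a, b) == mul(b, a) for a in A for b in B)

# The map A x B -> AB, (a, b) -> ab, with coordinate-wise multiplication
# on the left-hand side.
phi = {(a, b): mul(a, b) for a, b in product(A, B)}
is_hom = all(
    phi[(mul(a1, a2), mul(b1, b2))] == mul(phi[(a1, b1)], phi[(a2, b2)])
    for (a1, b1) in phi for (a2, b2) in phi
)
is_bijective = len(set(phi.values())) == len(phi) == len(AB)
```

Since the two subgroups meet only in the identity and commute elementwise, the map (a, b) → ab is a bijective homomorphism onto AB, a group of order 12.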


3.6 Other Variations: Semidirect Products and Subdirect 
Products 


3.6.1 Semidirect Products 


Suppose we are given a homomorphism 
f : H → Aut(N),


for two abstract groups N and H. Then f defines a formal construction of a group
N:H, which is called a semidirect product of N by H.¹⁴ The elements of N:H are


¹⁴ In some older books there is other notation for the semidirect product: some sort of decoration
added to the “⊲” symbol. The notation N:H is the notation for a “split extension of groups” used
in the Atlas of Finite Groups [12].

the elements of the Cartesian product N × H. Multiplication proceeds according to
this rule: for any two elements (n₁, h₁) and (n₂, h₂) in N × H,


\[ (n_1, h_1)(n_2, h_2) = \bigl(n_1 (n_2)^{f(h_1^{-1})},\ h_1 h_2\bigr). \]


Verification of the associative law for a triple product (n₁, h₁)(n₂, h₂)(n₃, h₃) comes
down to the calculation


\[ n_3^{f((h_1 h_2)^{-1})} = \bigl(n_3^{f(h_2^{-1})}\bigr)^{f(h_1^{-1})}. \]
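
The multiplication rule and the associativity check can be carried out exhaustively in a small case. The sketch below is illustrative and rests on assumptions chosen here: N is the additive group of integers mod 7, H is the multiplicative group {1, 2, 4} of quadratic residues mod 7, and f(h) is taken to be multiplication by h. It also assumes Python 3.8 or later, where pow(h, -1, n) returns a modular inverse.

```python
from itertools import product

n = 7
N = range(n)       # the additive group of integers mod 7
H = (1, 2, 4)      # the quadratic residues mod 7, a multiplicative group

def act(x, h):
    """The automorphism f(h) of N, taken here to be x -> h*x mod 7."""
    return (h * x) % n

def sd_mul(a, b):
    """(n1, h1)(n2, h2) = (n1 + n2^{f(h1^{-1})}, h1*h2), with N written additively."""
    (n1, h1), (n2, h2) = a, b
    h1_inv = pow(h1, -1, n)          # inverse of h1 modulo 7
    return ((n1 + act(n2, h1_inv)) % n, (h1 * h2) % n)

elements = list(product(N, H))
element_set = set(elements)

closed = all(sd_mul(a, b) in element_set for a in elements for b in elements)
associative = all(
    sd_mul(sd_mul(a, b), c) == sd_mul(a, sd_mul(b, c))
    for a in elements for b in elements for c in elements
)
identity_ok = all(sd_mul((0, 1), a) == a == sd_mul(a, (0, 1)) for a in elements)
non_abelian = sd_mul((1, 1), (0, 2)) != sd_mul((0, 2), (1, 1))
```

Under these assumptions the construction yields a non-abelian group with 21 elements.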


Example 26 1. If f : H — Aut(N) is the trivial homomorphism—that is, f maps 
every element of H to the identity automorphism of N—then N:H is just the 
ordinary direct product N x H of the previous section. 

2. If A and B are subgroups of a group G for which A normalizes B and A ∩ B = 1,
then the subgroup AB of G is isomorphic to the semidirect product B:A with
respect to the morphism f : A → Aut(B) which takes each element a of A to
the automorphism of B induced by conjugating the elements of B by a, that is,
ψ_a|_B.

3. Let N be the additive group of integers mod n, and let H be any subgroup of
the multiplicative group of residues coprime to n (for example, the quadratic
residues coprime to n). Let G be the group of all permutations of N of the form


μ(m, h) : x → hx + m, for all x ∈ N.


Then G is the semidirect product N:H. 

4. Let F be any field, let F* be the multiplicative group of all nonzero elements of
F, and let H be any subgroup of F*. Then the set of all transformations of F of
the form


x → hx + a, a ∈ F, h ∈ H,


forms a group under the composition of such transformations. This group is
isomorphic to the semidirect product (F, +):H, where (F, +) denotes the group
on F whose operation is addition in the field F.

5. (The Frobenius group of order 21.) The group is generated by elements x and y
subject only to the relations


x³ = 1, y⁷ = 1, x⁻¹yx = y².


It is a semidirect product Z₇ : Z₃. (This is an example of a “presented group”,
which we shall meet in Chap. 6.) Here, the presentation reveals the homomorphism
H = ⟨x⟩ → Aut(⟨y⟩). This group is isomorphic to a group constructed
by the recipe in part 4, where N is the additive group of the field Z/(7) and H
is the multiplicative group of quadratic residues mod 7.

6. Suppose A is an abelian group of order n which is not a direct product of Z₂’s.
Then the mapping σ : A → A defined by a^σ = a⁻¹ for all a ∈ A is an
automorphism of order 2. Using the inclusion ⟨σ⟩ ⊆ Aut(A) for f, one can
form the semidirect product A⟨σ⟩ as above. Some authors refer to A⟨σ⟩ as a
generalized dihedral group.


7. A group G is said to be a normal extension of a group N by H (written G = H.N)
if and only if

N ⊴ G and G/N ≅ H.


The extension is said to be split if and only if there is a subgroup H₁ of G such
that


G = NH₁, N ∩ H₁ = 1.


In this case, H₁ ≅ H. One can see that


G is a split extension of N by H if and only if G is a semidirect product of N
by H.


First, if the extension is split, every element of G has the form g = nh, (n, h) ∈
N × H₁. Because N ∩ H₁ = 1, the expression g = nh is unique for a given
element g. We thus have a bijection


σ : G → N × H

which takes g = nh to (n, τ(h)), where τ is the isomorphism H₁ → H ≅ G/N
(from Corollary 3.4.6, part (ii)). Now if we multiply two elements of G, we
typically obtain


\[ (n_1 h_1)(n_2 h_2) = n_1 (h_1 n_2 h_1^{-1}) h_1 h_2 = n_1 (n_2)^{h_1^{-1}} \cdot h_1 h_2, \]


where, by our convention on inner automorphisms, g^x := x⁻¹gx. The σ-value
of this element is the pair


\[ \bigl(n_1 (n_2)^{h_1^{-1}},\ \tau(h_1)\tau(h_2)\bigr). \]


Thus σ is an isomorphism of G with the semidirect product of N by H, defined
by composing τ⁻¹ with the homomorphism


H₁ → Aut(N)


induced by conjugation. (The latter is just the restriction to H₁ of the
homomorphism ψ of Example 3, part (i).)

Conversely, if G is the semidirect product of N by H defined by some
homomorphism ρ : H → Aut(N), then, setting

N₀ := {(n, 1) | n ∈ N},
H₀ := {(1, h) | h ∈ H},


we have

N₀ ⊴ G = N₀H₀ and N₀ ∩ H₀ = 1,


so G is the split extension of N₀ by H₀, and conjugation in G by an element h₀ of H₀
induces the automorphism ρ(h₀) on N₀. (Clearly, N₀ ≅ N and H₀ ≅ H.)


3.6.2 Subdirect Products 


Something a little less precise is the notion of a subdirect product. We say that a
group H is a subdirect product of the groups {G_σ | σ ∈ I} if and only if (i) H is a
subgroup of the direct product ∏_I G_σ and (ii) for each index τ, the restriction of
the projection map π_τ to H is onto, i.e. π_τ(H) = G_τ. There may be many such
subgroups, so the isomorphism type of H is not uniquely determined by (i) and (ii)
above.

Subdirect products naturally arise in the following situation: Suppose M and N
are normal subgroups of a group G. Then G/(M ∩ N) is a subdirect product of
G/N and G/M. This is because G/(M ∩ N) can be embedded as a subgroup of
(G/M) × (G/N) by mapping each coset (N ∩ M)x to the ordered pair (Mx, Nx).
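
A concrete instance of this embedding, offered only as an illustrative sketch, takes G to be the additive group of integers with M = 4Z and N = 6Z, so that M ∩ N = 12Z. The cosets of M ∩ N are represented below by the residues 0, ..., 11; these choices and the variable names are assumptions made for the example.

```python
from itertools import product

# Each coset x + 12Z is sent to the pair (x + 4Z, x + 6Z).
image = {(x % 4, x % 6) for x in range(12)}
full = set(product(range(4), range(6)))      # all of (Z/4) x (Z/6)

def add(a, b):
    """Coordinate-wise addition in (Z/4) x (Z/6)."""
    return ((a[0] + b[0]) % 4, (a[1] + b[1]) % 6)

# A non-empty finite subset closed under the operation is a subgroup.
is_subgroup = all(add(a, b) in image for a in image for b in image)
embedding = len(image) == 12                 # distinct cosets have distinct images
onto_first = {a for a, _ in image} == set(range(4))
onto_second = {b for _, b in image} == set(range(6))
proper = image != full
```

Here the image is a subgroup of order 12 in a direct product of order 24 which nevertheless projects onto both factors, so it is a subdirect product without being the full direct product.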

In fact something very general happens: 

If {N_σ | σ ∈ I} is a family of normal subgroups of a group G, then


• G/(∩_{σ∈I} N_σ) is a subdirect product of the groups {G/N_σ | σ ∈ I}.

• Suppose ℱ is a family of groups “closed under taking subdirect products”, that is,
any subdirect product of members of ℱ is in ℱ. (For example the class of abelian
groups is closed under subdirect products.) Then for any arbitrary group G, there
is a “smallest” normal subgroup G_ℱ whose associated factor group G/G_ℱ lies in
ℱ. Precisely, G/G_ℱ ∈ ℱ and if N is normal in G with G/N ∈ ℱ, then G_ℱ ≤ N.


A simple group is a group G whose only proper normal subgroup is the identity 
subgroup. (Note that by this definition, the group of order one is not a simple group.) 
A maximal normal subgroup of a group G is a maximal element in the partially 
ordered set of proper normal subgroups of G (ordered by inclusion, of course). 
In Exercise (1) in Sect. 3.7.2 the following is proved: 


Lemma 3.6.1 A factor group G/N of G is a simple group if and only if N is a 
maximal normal subgroup of G. 


We begin with an elementary Lemma.

Lemma 3.6.2 Suppose {M_i | i ∈ I} is a finite collection of maximal normal
subgroups of a group G. Then for some subset J of I,


\[ G/\Bigl(\bigcap_{i \in I} M_i\Bigr) = G/\Bigl(\bigcap_{j \in J} M_j\Bigr) \cong \prod_{j \in J} (G/M_j), \]


a direct product of simple groups.


Proof The result is true for |I| = 1, as G/M₁ is simple. We use induction on
|I|. Renumbering the subscripts if necessary, the induction hypothesis allows us to
assume that for some subset J = {1, ..., d} ⊆ I′ = {1, ..., k − 1},


\[ N := \bigcap_{i \in I'} M_i = \bigcap_{j \in J} M_j, \tag{3.10} \]
\[ G/N \cong (G/M_1) \times \cdots \times (G/M_d). \tag{3.11} \]


We set I := {1, ..., k} = I′ ∪ {k} and assume, without any loss of generality,
that M_k does not contain N (otherwise N ∩ M_k = N and there is nothing more to
prove). Then since NM_k is a normal subgroup of G properly containing M_k, we must
have G = NM_k. It follows that the homomorphism G → (G/N) × (G/M_k) taking x to
(Nx, M_k x) is onto; its kernel is N ∩ M_k. Then

\[ G/(N \cap M_k) \cong (G/N) \times (G/M_k), \]


and the result for |I| = k follows upon substitution for G/N in the right-hand side.


One may conclude. 


Corollary 3.6.3 


(i) Suppose G is a finite group with no non-trivial proper characteristic subgroups. 
Then G is a direct product of pairwise isomorphic simple groups. 

(ii) If N is a minimal normal subgroup of a finite group G, then N is a direct product
of isomorphic simple groups.


Proof Part (i) Let M be a maximal normal subgroup of the finite group G. Then G
has a finite automorphism group, and so {M^σ | σ ∈ Aut(G)} is a finite collection of
maximal normal subgroups of G whose intersection N is a characteristic subgroup of
G properly contained in G. By hypothesis, N = 1. Then by Lemma 3.6.2 above,
G ≅ G/N is the direct product of simple groups. If G = 1, we have an empty direct
product and there is nothing to prove. Otherwise, the product of those direct factors
isomorphic to the first direct factor clearly forms a non-trivial characteristic subgroup,
which, by hypothesis, must be the whole group.

Part (ii) Since N is a minimal normal subgroup of the finite group G, N is a
non-trivial finite group with only the identity subgroup as a proper characteristic
subgroup (a characteristic subgroup of N is normal in G by Theorem 3.4.3),
so the conclusion of Part (i) holds for N.

3.7 Exercises


3.7.1 Exercises for Sects. 3.1-3.3 


1. Write out formal proofs of Lemmas 3.3.1 and 3.3.2 (all parts). 

2. Prove Corollary 3.2.4. 

3. Prove that G/Z(G) can never be a non-trivial cyclic group. [Hint: If false, cosets
of the form Z(G)x^k partition G, and elements in one of these cosets commute with
those in any other such coset. So G can be shown to be abelian.]

4. Prove Corollary 3.4.2. [Hint: Use the subgroup criterion (Lemma 3.2.3) for part 3.]


5. Suppose a group G has exponent 2, that is, g² = 1 for every g ∈ G. Show that
G is abelian.
6. Suppose p is a (positive) prime number and k is a positive integer.


(i) If p > 2, show that Aut(Z_{p^k}) ≅ Z_{p−1} × Z_{p^{k−1}}.
(ii) If p = 2 and k ≥ 2, show that Aut(Z_{2^k}) ≅ Z₂ × Z_{2^{k−2}}.


3.7.2 Exercises for Sect. 3.4 


1. Suppose N is a normal subgroup of the group G.


(a) Show that there is an isomorphism between the poset of all subgroups H
of G which contain N, and the poset of all subgroups of G/N (both posets
partially ordered by the containment relation).
[Hint: The isomorphism takes H in the first poset to H/N in the second.]

(b) Show that H is normal in G if and only if H/N is normal in G/N. Thus the
isomorphism of part (a) and its inverse both preserve the normality relation.
(Note that they need not preserve the property of being characteristic in
either direction. The group H containing N is called the inverse image of
H/N.)

(c) Using the fundamental theorem of homomorphisms conclude that if


f : G → L


is an epimorphism of groups, then there is a 1-1 correspondence of the
subgroups of L with the subgroups of G containing ker f preserving
containment and normality. The correspondence takes a subgroup L₁ of L to
the subset {g ∈ G | f(g) ∈ L₁}, also called the inverse image of L₁.


2. If X is a subset of the group G (that is, the subset X is not necessarily a subgroup
of G), define the centralizer in G of X to be the set of all group elements g of
G such that gx = xg for all x ∈ X. This set is denoted C_G(X). Similarly, the
normalizer in G of X is the set of elements N_G(X) := {g ∈ G | g⁻¹Xg = X}. Such
elements may permute the elements of X by conjugation, but they stabilize X as a set.
A subgroup H is said to normalize X if and only if H ≤ N_G(X). If N_G(X) = G,
then X is called a normal set in G.


(a) Show that C_G(X) and N_G(X) are subgroups of G.

(b) Show that if X is a normal set in G, then C_G(X) is also normal in G.

(c) Show that if N is a normal subgroup of G, then the inner automorphism
ψ_g : x → g⁻¹xg induces an automorphism of N.


3. (a) Conclude that a characteristic subgroup of a normal subgroup of G is normal
in G, that is, K char N ⊴ G implies K ⊴ G.
(b) Conclude that a characteristic subgroup of a characteristic subgroup of G is
characteristic in G, that is, L char K char G implies L char G.
(c) Suppose N is a normal subgroup of G. Show that the mapping which sends
element g to ψ_g|_N, the restriction of the inner automorphism conjugation-by-g
to N, defines a group homomorphism


ψ′ : G → Aut(N)


whose kernel is C_G(N). Conclude that the group of automorphisms induced
on N by the inner automorphisms of G is isomorphic to G/C_G(N).


4. Show that for any group G, Inn(G) ⊴ Aut(G). (The factor group Out(G) :=
Aut(G)/Inn(G) is called the outer automorphism group of G.)
5. Prove the assertions of Corollary 3.4.2.
6. Recall from p. 99 that a group is said to be a simple group if and only if its only
proper normal subgroup is the identity subgroup. (Note that the definition forbids
the identity group to be a simple group.)


(a) Prove that any group of prime order is a simple group. 

(b) Suppose G is a group. A subgroup M is a maximal subgroup of G if and 
only if it is maximal in the poset of all proper subgroups of G. A subgroup 
M is a maximal normal subgroup of G if and only if it is maximal in the 
poset of all proper normal subgroups of G. (Note that it does not mean that
it is a maximal subgroup which is normal. A maximal normal subgroup may
very well not be a maximal subgroup.) Prove that a factor group G/N of G
is a simple group if and only if N is a maximal normal subgroup. 
[Hint: Use the result of Exercise (1)b in Sect. 3.7.2 just above.] 


3.7.3 Exercises for Sects. 3.5—3.7 


1. Let V be any vector space, and form the group GL(V) of all invertible linear
transformations of V. Let G be any group and let f : G → GL(V) be a
homomorphism of groups. [Such a homomorphism is said to be a representation of the
group G.] Define multiplication on V × G by the rule

\[ (v_1, g_1) \cdot (v_2, g_2) = \bigl(v_1 + v_2^{f(g_1^{-1})},\ g_1 g_2\bigr). \]


(a) Show that this group is isomorphic to the semidirect product (V, +):G upon
noting that GL(V) ≤ Aut(V, +).
(b) Show that the center of (V, +):G is the direct product C_G(V) × ker f.


2. A popular textbook in intermediate algebra (after presenting the Sylow theorems
which will appear in the next chapter) offers an exercise requesting that the reader
prove that every group of order 75 is abelian. Using the semidirect product
construction, show that there exists a non-abelian group of order 75.
[Hint: Let V := Z₅ × Z₅. One can regard V as a vector space over the field
Z/(5) of integers mod 5. Let t : V → V be a linear transformation acting with
minimal polynomial x² + x + 1. This means V has a basis {v₁, v₂} with v₁^t = v₂,
and v₂^t = −v₁ − v₂. Then t³ induces the identity transformation 1_V. Thus ⟨t⟩,
as a subgroup of GL(V), has order 3. One can then form the semidirect product
(V, +)⟨t⟩ as described in the previous exercise.]

3. Suppose K is a characteristic subgroup of N and N is a normal subgroup of G. 
Show that K is a normal subgroup of G. 

[Hint: Conjugation by any element of G induces an automorphism of N and so 
leaves K invariant, since K is characteristic in N.] 

4. Show that the group of inner automorphisms of G is a normal subgroup of the 
group of all automorphisms of G. 

[Hint: Let ψ(g) be conjugation by g. Show that 

$$\sigma^{-1} \cdot \psi(g) \cdot \sigma$$

is conjugation by $\sigma^{-1}(g)$ and so is an inner automorphism.] 


Chapter 4 
Permutation Groups and Group Actions 


Abstract A useful paradigm for discussing a group is to regard it as acting as a group 
of permutations of some set. The power of this point of view derives from the flexi- 
bility one has in choosing the set being acted on. Odd and even finitary permutations, 
the cycle notation, orbits, the basic relation between transitive actions and actions 
on cosets of a subgroup are first reviewed. For finite groups, the paradigm produces 
Sylow’s theorem, the Burnside transfer and fusion theorems, and the calculations of 
the order of any group of automorphisms of a finite object. Of more special interest 
are primitive and multiply transitive groups. 


4.1 Group Actions and Orbits 


As defined earlier, a permutation of a set X is a bijective mapping X → X. Because 
we will be dealing with groups of permutations it will be convenient to use the 
“exponential” notation which casts permutations in the role of right operators. Thus if 
π is a permutation of X, we write $x^{\pi}$ for the image of element x under the permutation 
π. Recall that composition of bijections and inverses of bijections are bijections, so 
that the set of all bijections of set X into itself forms a group under composition which 
we have called the symmetric group on X and have denoted Sym(X). Recall also 
that any bijection ν : X → Y induces a group isomorphism $\bar{\nu}$ : Sym(X) → Sym(Y) 
by the rule: 

$$\nu(x)^{\bar{\nu}(\pi)} := \nu(x^{\pi}).$$


In order to emphasize the distinction between the elements of Sym(X) and the 
elements of the set X which are being permuted, we often refer to the elements 
of X by the neutral name “letters”. In view of the isomorphism just recorded, the 
symmetric group on any finite set of n elements can be thought of as permuting the 
set of symbols (or ‘letters’) $\Omega_n := \{1, 2, \ldots, n\}$. This group is denoted Sym(n) and 
has order n!. 

Now suppose H is any subgroup of Sym(X). Say for the moment that two elements 
of X are H-related if and only if there exists an element of H which takes one to 
the other. It is easy to see that “H-relatedness” is an equivalence relation on the 
elements of X. The equivalence classes with respect to this relation are called the 
H-orbits of X. Within an H-orbit, it is possible to move from any one letter to any 
other, by means of an element of H. Thus X is partitioned into H-orbits. We say H 
is transitive on X if and only if X comprises a single H-orbit. 

We now extend these notions in a very useful way. We say that a group G acts on 
set X if and only if there is a group homomorphism 


f : G → Sym(X). 


We refer to both f and the image f(G) as the action of G, and we shall borrow 
almost any adjective of a subgroup f(G) of Sym(X), and apply it to G. Thus we 
call an f(G)-orbit, a G-orbit, we say “G acts in k orbits on X” if X partitions into 
exactly k such f(G)-orbits, and we say “G is transitive on X”, or “acts transitively 
on X” if and only if f(G) is transitive on X. 

If the action f is understood, we write $x^g$ instead of $x^{f(g)}$. Also the unique G-orbit 
containing letter x is the set $\{x^g \mid g \in G\}$, and is denoted $x^G$. Its cardinality is called 
the length of the orbit. The group action is said to be faithful if and only if the kernel 
of the homomorphism f : G → Sym(X) is the identity subgroup of G, that is, 
f is an embedding of G as a subgroup of the symmetric group. 
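
For a finite set the orbit partition described above can be computed directly. What follows is a minimal Python sketch, not part of the original notes: the helper names `perm_from_cycles` and `orbits` are our own illustrative choices, and a permutation is stored as a dictionary sending each letter x to its image. Since every permutation of a finite set has finite order, closing a letter under the generators alone (without their inverses) already produces its full orbit.

```python
def perm_from_cycles(cycles, letters):
    """Build a permutation (dict: letter -> image) from disjoint cycles."""
    p = {x: x for x in letters}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            p[x] = cycle[(i + 1) % len(cycle)]
    return p

def orbits(generators, letters):
    """Partition `letters` into orbits of the group generated by `generators`."""
    remaining = set(letters)
    found = []
    while remaining:
        start = min(remaining)
        orbit, frontier = {start}, [start]
        while frontier:
            x = frontier.pop()
            for g in generators:
                if g[x] not in orbit:
                    orbit.add(g[x])
                    frontier.append(g[x])
        found.append(sorted(orbit))
        remaining -= orbit
    return found

letters = range(1, 7)
a = perm_from_cycles([(1, 2, 3)], letters)
b = perm_from_cycles([(4, 5)], letters)
print(orbits([a, b], letters))                 # [[1, 2, 3], [4, 5], [6]]

# Adding a generator that links the orbits makes the group transitive.
c = perm_from_cycles([(3, 4), (5, 6)], letters)
print(len(orbits([a, b, c], letters)) == 1)    # True
```

In this example the group generated by a and b acts in three orbits, while the group generated by a, b and c is transitive.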

The power of group actions derives from the fact that the same group can have 
several actions, each of which can yield new information about a group. We have 
already met several examples of group actions, although not in this current language. 
We list a few. 


4.1.1 Examples of Group Actions 


1. Recall that each rigid rotation of a regular cube induces a permutation on the 
following sets: 


(a) the four vertex-centered axes. 
(b) the three face-centered axes. 
(c) the six edge-centered axes. 
(d) the six faces. 

(e) the eight vertices. 

(f) the twelve edges. 


If Y is any one of the six sets just listed, and G is the full group of rigid rotations 
of the cube, then we obtain a transitive action $f_Y$ : G → Sym(Y). Except for 
the case Y is the three perpendicular face-centered axes (Case (b)), the action is 
faithful. The action is an epimorphism onto the symmetric group only in Cases 
(a) and (b). 

2. If N is a normal subgroup of G, then G acts on the elements of N by inner 
automorphisms of G. Thus $n^g := g^{-1} n g$ for each element g of G and each 
element n of N, to give the group action f : G → Sym(N). Clearly the identity 
element of G forms one of the G-orbits and has length one. 

If N = G, the G-orbits $x^G := \{g^{-1} x g \mid g \in G\}$, x ∈ G, are called conjugacy 
classes of G. Now one may consider the transitive action G → Sym($x^G$) on a 
single conjugacy class $x^G$. 

3. The image $H^{\sigma}$ of any subgroup H of G under an automorphism σ is again a 
subgroup of G, and this is certainly true of inner automorphisms. Thus $H^g := g^{-1} H g$ 
is a subgroup of G called a G-conjugate of H or just conjugate of H. Thus 
there is an action G → Sym(S(G)) where S(G) is the collection of all subgroups 
of G. As in the previous case, this restricts to the transitive action G → Sym($H^G$) 
on the set $H^G := \{H^g \mid g \in G\}$ of all conjugates of the subgroup H. 

4. Finally we can take any fixed subset X of elements of G and watch the action 
G → Sym($X^G$) on the collection $X^G := \{g^{-1} X g \mid g \in G\}$ of all G-conjugates of 
set X. 

5. Let $P_k = P_k(V)$ be the collection of all k-dimensional subspaces of the vector 
space V, where k is a natural number. We assume dim V > k to avoid the possibility 
that $P_k$ is empty. Since an invertible linear transformation must preserve 
the dimension of any finite dimensional subspace, we obtain an action 


f : GL(V) → Sym($P_k$). 


It is necessarily transitive. 
6. Similarly, if we have an action f : G → Sym(X), where |X| > k, we also inherit 
an action $f_Y$ : G → Sym(Y) on the following sets Y: 


(a) the set $X^k$ of ordered k-tuples of elements of X. 
(b) the set $X^{k*}$ of ordered k-tuples with pairwise distinct entries. 
(c) the set X(k) of all (unordered) subsets of X of size k. 


4.2 Permutations 


A permutation π ∈ Sym(X) is said to displace letter x ∈ X if and only if $x^{\pi} \neq x$. 
A permutation is said to be finitary if and only if it displaces only a finite number 
of letters. The products among, and inverses and conjugates of finitary permutations 
are always finitary, and so the set of all finitary permutations always forms a normal 
subgroup FinSym(X) of the symmetric group Sym(X). Any finitary permutation 
obviously has finite order. 

In this subsection we shall study certain arithmetic properties of group actions 
which are mostly useful for finite groups. 


4.2.1 Cycle Notation 


Now suppose π is a finitary permutation in Sym(X). Letting H be the cyclic subgroup 
of Sym(X) generated by π, we may partition X into H-orbits. By our self-imposed 
finitary hypothesis, there are only finitely many H-orbits of length greater than one. 
Let O be one of these, say of length n. Then for any letter x in O, we see that 
$O = x^H$ must consist of the set $\{x, x^{\pi}, x^{\pi^2}, \ldots, x^{\pi^{n-1}}\}$. Now $x^{\pi^n}$ must be a letter in 
this sequence which, by the injectivity of π, can only be x itself. We represent this 
permutation of O by the symbol: 

$$(x \;\; x^{\pi} \;\; x^{\pi^2} \;\cdots\; x^{\pi^{n-1}})$$


called a cycle. Stated in this generality, the notation is not very impressive. But in 
specific instances it is quite useful. For example the notation (1 2 4 7 6 3) denotes a 
permutation which takes 1 to 2, takes 2 to 4, takes 4 to 7, takes 7 to 6, takes 6 to 3, and 
takes 3 back to 1. Thus everyone is moved forward one position along a circular trail 
indicated by the cycle. In particular the cycle notation is not unique since you can 
begin anywhere in the cycle: thus (7 6 3 1 2 4) represents the same permutation just 
given. Moreover, writing the cycle with the reverse orientation of the circuit yields 
the inverse permutation, (4 2 1 3 6 7) in this example. 

Now the generator π of the cyclic group H = ⟨π⟩ acts on each H-orbit as a cycle. 
This can be restated as the assertion that any finitary permutation can be represented 
as a product of disjoint cycles, that is, cycles which pairwise displace no common 
letter. Such cycles commute with one another so it doesn’t matter in which order 
the cycles are written. Also, if the set X is known, the permutation is determined 
only by its list of disjoint cycles of length greater than one—the cycles of length 
one (indicating letters that are fixed) need not be mentioned. Thus in Sym(9), the 
following permutations are the same: 


(1 4 5 9 7)(2 6) = (6 2)(1 4 5 9 7)(8). 

Now to multiply two such permutations we simply compose the two permutations, 
and here the order of the factors does matter. (It is absolutely necessary that 
the student learn to compute these compositions.) Suppose 


π = (7 6 1)(3 4 8)(2 5 9)    (4.1) 
σ = (7, 10, 11)(9 2 4).    (4.2) 


Then, applying π first and σ second, the composition σ ∘ π of these right operators is 


πσ = (1, 10, 11, 7 6)(2 5)(3 9 4 8). 

We simply compute ⟨πσ⟩-orbits.¹ The computation begins with an involved letter, 
say “1”, and asks what πσ did to it. $1^{\pi} = 7$ and $7^{\sigma} = 10$, so $1^{\pi\sigma} = 10$. Then it 
asks what πσ did to 10 in turn. One gets $10^{\pi\sigma} = 11$, $11^{\pi\sigma} = 7$, and so on, until the 
⟨πσ⟩-orbit is completed upon returning to 1. One then looks for a new letter displaced 
by at least one of the two permutations, but which is not involved in the orbit just 
calculated—say, “2”—and one then repeats the process. 
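
The orbit-chasing procedure just described is mechanical, and the reader may find it useful to check hand computations by machine. The following Python sketch is offered only as an illustration under our own conventions: the names `perm_from_cycles`, `compose` and `cycles_of` are hypothetical helpers, permutations are dictionaries on the letters 1, ..., 11, and `compose(p, q)` applies p first and q second, in keeping with the right-operator convention of the text.

```python
def perm_from_cycles(cycles, letters):
    """Build a permutation (dict: letter -> image) from disjoint cycles."""
    p = {x: x for x in letters}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            p[x] = cycle[(i + 1) % len(cycle)]
    return p

def compose(p, q):
    """Right-operator product pq: apply p first, then q."""
    return {x: q[p[x]] for x in p}

def cycles_of(p):
    """Disjoint cycles of length > 1, each written from its smallest letter."""
    seen, result = set(), []
    for x in sorted(p):
        if x in seen or p[x] == x:
            continue
        cycle, y = [], x
        while y not in seen:
            seen.add(y)
            cycle.append(y)
            y = p[y]
        result.append(tuple(cycle))
    return result

letters = range(1, 12)
pi = perm_from_cycles([(7, 6, 1), (3, 4, 8), (2, 5, 9)], letters)
sigma = perm_from_cycles([(7, 10, 11), (9, 2, 4)], letters)

print(cycles_of(compose(pi, sigma)))   # [(1, 10, 11, 7, 6), (2, 5), (3, 9, 4, 8)]
print(cycles_of(compose(sigma, pi)))   # [(1, 7, 10, 11, 6), (2, 8, 3, 4), (5, 9)]
```

The first line of output agrees with the product πσ computed above; the second shows that the product taken in the other order is a different permutation.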

We make this observation: 


Lemma 4.2.1 The order of a finitary permutation expressed as a product of disjoint 
cycles is the least common multiple of the lengths of those cycles. 


Proof Suppose x and y are disjoint cycles of lengths a and b respectively. Since x 
and y commute, we have $(xy)^k = x^k y^k$ for any positive integer k. Thus if m is any 
common multiple of a and b, then $(xy)^m = 1$. On the other hand, if $(xy)^d = 1$, the 
fact that the cycles are disjoint yields $x^d = 1 = y^d$. This forces d to be a multiple of 
both a and b. Thus the order of xy is the least common multiple of a and b. 
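
As a quick numerical check of Lemma 4.2.1, the sketch below compares the least common multiple of the cycle lengths with the order found by brute-force powering. It is only an illustration: the function names are our own, and the test permutation is the product (1, 10, 11, 7 6)(2 5)(3 9 4 8) with cycle lengths 5, 2 and 4.

```python
from functools import reduce
from math import gcd

def perm_from_cycles(cycles, letters):
    """Build a permutation (dict: letter -> image) from disjoint cycles."""
    p = {x: x for x in letters}
    for cycle in cycles:
        for i, x in enumerate(cycle):
            p[x] = cycle[(i + 1) % len(cycle)]
    return p

def order_from_cycle_lengths(lengths):
    """Least common multiple of the cycle lengths (Lemma 4.2.1)."""
    return reduce(lambda a, b: a * b // gcd(a, b), lengths, 1)

def order_by_powers(p):
    """Smallest k >= 1 with p^k equal to the identity, found by brute force."""
    identity = {x: x for x in p}
    power, k = dict(p), 1
    while power != identity:
        power = {x: p[power[x]] for x in p}   # multiply the current power by p
        k += 1
    return k

cycles = [(1, 10, 11, 7, 6), (2, 5), (3, 9, 4, 8)]
p = perm_from_cycles(cycles, range(1, 12))
print(order_from_cycle_lengths([len(c) for c in cycles]))   # 20
print(order_by_powers(p))                                   # 20
```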


4.2.2 Even and Odd Permutations 


We begin with a technical result: 
Lemma 4.2.2 If k and l are non-negative integers, then 


(i) $(a\,b)(a\,x_1 \ldots x_k\,b\,y_1 \ldots y_l) = (a\,y_1 \ldots y_l)(b\,x_1 \ldots x_k)$, and 
(ii) $(a\,b)(a\,y_1 \ldots y_l)(b\,x_1 \ldots x_k) = (a\,x_1 \ldots x_k\,b\,y_1 \ldots y_l)$. 


Proof The permutation on the left side of the first equation sends a to $y_1$, sends $y_i$ 
to $y_{i+1}$ for i < l, and $y_l$ to a. Similarly it sends b to $x_1$, sends $x_j$ to $x_{j+1}$ for j < k, 
and sends $x_k$ to b. Thus the left side has been expressed as the product of the two 
disjoint cycles on the right side of the first equation. 

The second equation follows from the first by multiplying both sides of the first 
equation by the 2-cycle (a b). 


Let FinSym(X) denote the group of all finitary permutations of the set of ‘letters’ 
X. Each element π of FinSym(X) is expressible as a finite product of disjoint cycles 
of lengths $d_1, \ldots, d_r$ greater than one, together with arbitrarily many cycles of length 
one. These numbers $\{d_j\}$ are uniquely determined by π since they are the orbit-lengths, 
greater than one, of the group ⟨π⟩. Define the sign of π to be the number 

$$\mathrm{sgn}(\pi) := \prod_{j=1}^{r} (-1)^{d_j - 1}.$$

Then the function sgn : FinSym(X) → {±1} is well-defined. 


¹We will follow Burnside in explaining that commas are introduced into the cycle notation only for 
the specific purpose of distinguishing a 2-or-more-digit entry from other entries. 

A permutation of Sym(X) which displaces exactly two letters is called a transposition. 
Thus every transposition can be expressed in cycle notation by (a b), for the 
two displaced letters, a and b. Clearly, the sign of a transposition is $(-1)^{2-1} = -1$. 

We now can state 


Lemma 4.2.3 If g is a permutation in FinSym(X), and if t is a transposition, then 


sgn(tg) = −sgn(g) = sgn(gt). 


Proof The permutation g is a product of disjoint cycles $g_j$ of length $d_j$. Thus 
sgn(g) = $\prod_j (-1)^{d_j - 1}$. Suppose t is the transposition (a b). If the letters a and 
b are not involved in any of the cycles $g_j$, then the result follows from the formula 
for the sign of gt = tg since we have simply tacked on the disjoint cycle (a b) in 
forming gt. 

Suppose on the other hand, that a and b appear in the same ⟨g⟩-orbit, say the one 
represented by the cycle $g_j$. Then $g_j$ has the form $(a\,x_1 \ldots x_k\,b\,y_1 \ldots y_l)$, where k 
and l are non-negative. By Lemma 4.2.2, 

$$t g_j = (a\,y_1 \ldots y_l)(b\,x_1 \ldots x_k).$$

Thus $d_j = k + l + 2$, so $\mathrm{sgn}(g_j) = (-1)^{k+l+1}$ while $\mathrm{sgn}(t g_j) = (-1)^k (-1)^l$, so 
$\mathrm{sgn}(t g_j) = -\mathrm{sgn}(g_j)$. Since $tg = g_1 \cdots (t g_j) \cdots g_m$, the result follows from the 
formula. 

Now suppose a occurs in one cycle, say $g_1 = (a\,x_1 \ldots x_k)$, and b occurs in another 
cycle, which we can take to be $g_2 = (b\,y_1 \ldots y_l)$. Then by Lemma 4.2.2, 

$$tg = (t g_1 g_2)\, g_3 \cdots g_m = (a\,y_1 \ldots y_l\,b\,x_1 \ldots x_k)\, g_3 \cdots g_m.$$

Thus 

$$\mathrm{sgn}(tg) = \mathrm{sgn}(t g_1 g_2) \prod_{j \ge 3} (-1)^{d_j - 1} = (-1)\,\mathrm{sgn}(g_1)\,\mathrm{sgn}(g_2) \prod_{j \ge 3} (-1)^{d_j - 1} = -\mathrm{sgn}(g).$$

So the result follows in this case. The proof is complete. 


Now any cycle $(x_1 \ldots x_k)$ is a product of transpositions 

$$(x_1\,x_k)(x_k\,x_{k-1}) \cdots (x_3\,x_2).$$


Since each element of FinSym(X) is the product of disjoint cycles we see that 
FinSym(X) is generated by its transpositions. 

It now follows from Lemma 4.2.3 that no element can be both a product of an even 
number of transpositions as well as a product of an odd number of transpositions. 
Thus we have a partition 

$$\mathrm{FinSym}(X) = A^{+} + A^{-}$$

of all the elements of FinSym(X) into the set $A^{+}$ of all finitary permutations which 
are a product of an even number of transpositions, and $A^{-}$, all finitary permutations 
which are a product of an odd number of transpositions. Right multiplication by any 
finitary permutation at best permutes the two sets $A^{+}$ and $A^{-}$, and multiplication by 
a transposition clearly transposes them. Thus we have a transitive permutation action 

$$\mathrm{sgn} : \mathrm{FinSym}(X) \to \mathrm{Sym}(\{A^{+}, A^{-}\}).$$

The right side is Sym(2) ≅ $Z_2$, isomorphic to the multiplicative group {±1}. 

The kernel of this action is the normal subgroup FinAlt(X) := $A^{+}$, the finitary 
alternating group. Its elements are called the even permutations. The elements in the 
other coset $A^{-}$ are called odd permutations. These terms are used only for finitary 
permutations. If X is a finite set, there is no distinction gained by singling out the 
finitary permutations, and the prefix “Fin” is dropped throughout. So in that case the 
alternating group is denoted Alt(X) or Alt(n) when |X| = n. 

The factorization of a cycle of length n given above shows that any cycle of even 
length is an odd permutation and any cycle of odd length is an even permutation. It 
only “sounds” confusing; by the formula, the sign of an n-cycle is $(-1)^{n-1}$. 
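
The behaviour of the sign under multiplication by a transposition (Lemma 4.2.3) can be confirmed exhaustively in a small case. The sketch below does this for Sym(4); it is an illustration only, with permutations stored as tuples p where p[i] is the image of the letter i in {0, 1, 2, 3}, and with helper names of our own choosing.

```python
from itertools import combinations, permutations

def sign(p):
    """Sign of p computed from its cycle lengths: product of (-1)^(d - 1)."""
    seen, s = set(), 1
    for x in range(len(p)):
        if x in seen:
            continue
        d, y = 0, x
        while y not in seen:          # walk once around the cycle through x
            seen.add(y)
            y = p[y]
            d += 1
        s *= (-1) ** (d - 1)
    return s

def compose(p, q):
    """Right-operator product pq: apply p first, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def transposition(n, a, b):
    """The transposition (a b) on the letters 0, ..., n - 1."""
    t = list(range(n))
    t[a], t[b] = b, a
    return tuple(t)

n = 4
group = list(permutations(range(n)))
flips = all(
    sign(compose(transposition(n, a, b), g)) == -sign(g) == sign(compose(g, transposition(n, a, b)))
    for g in group
    for a, b in combinations(range(n), 2)
)
even = [g for g in group if sign(g) == 1]
print(flips, len(even))   # True 12
```

The count 12 is the order of Alt(4), half the order of Sym(4), as the partition into the two cosets $A^{+}$ and $A^{-}$ requires.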


4.2.3 Transpositions and Cycles: The Theorem of Feit, Lyndon 
and Scott 


Recall that a transposition of a symmetric group Sym(X) is an element which dis- 
places exactly two of the letters of the set X. These two letters must clearly exchange 
places, otherwise we are dealing with the identity permutation. 

Let T be any collection of transpositions of Sym(X). We can then construct a 
simple² graph $\mathcal{G}(T) := (X, T)$ with vertex set X and edge set consisting of the 
transposed 2-subsets determined by each involution in T. 


²In the context of graphs (as opposed to groups) “simple” actually means something rather simple. 
A simple graph is one which is undirected, without loops or multiple edges. That is, edges connect 
only distinct vertices, there is at most one edge connecting two distinct vertices, and no orientation 
to any edge. In other words, edges are 2-subsets of the vertex set. 

Lemma 4.2.4 Let T be a set of transpositions of Sym(X). Suppose G(T), the subgroup 
generated by the transpositions of T, acts on X with orbits $X_\sigma$ of finite length. 
Then G(T) can be expressed as a direct sum: 

$$G(T) \cong \bigoplus_{\sigma} \mathrm{Sym}(X_\sigma)$$

where the $X_\sigma$ are the G(T)-orbits on X. 


Proof First of all, since transpositions displace just two letters, G(T) is a finitary 
permutation group on X. Let $X_\sigma$ be a connected component of the graph $\mathcal{G}(T)$ and 
suppose x and y are two of its vertices. Since there is a sequence of transpositions 
defining the edges of a path in $X_\sigma$ connecting x and y, the product of these transpositions 
in the order they occur in the path is an element of G(T) taking x to y. On the 
other hand, there is no such path connecting a vertex in one connected component 
of $\mathcal{G}(T)$ with a vertex in another connected component, and so no element of the 
group G(T) can move one of these vertices to the other. Thus the connected components 
$X_\sigma$ are actually the G(T)-orbits. Let $T_\sigma$ be the set of transpositions which 
form an edge of the connected component $X_\sigma$ of $\mathcal{G}(T)$, and let $G_\sigma$ be the subgroup 
they generate. Then $G_\sigma$ is transitive on $X_\sigma$, but fixes each other orbit point-wise. It 
follows that any conjugate $g^{-1} G_\sigma g$ in G(T) displaces only vertices in $X_\sigma$, and so is 
a subgroup of $G_\sigma$, since it is generated by transpositions in $T_\sigma$. Thus each $G_\sigma$ is normal 
in G(T) and if $X_\sigma$ and $X_\tau$ are distinct orbits, we have $G_\sigma \cap G_\tau = 1$. Since G(T) 
is finitary, each element of the group is uniquely determined by its action on each 
orbit $X_\sigma$. Conversely, if $G_\sigma \cong \mathrm{Sym}(X_\sigma)$, we see that any product of permutations 
displacing only finitely many letters is a product of involutions, and so is an element 
of G(T). It then follows that G(T) is the direct sum of the groups $G_\sigma$. 

So all that remains is to show that $G_\sigma$ acts on $X_\sigma$ as the full symmetric group 
on the finite orbit $X_\sigma$. Thus, without loss of generality, we assume that G(T) is 
transitive on the finite set X of cardinality n and that the transpositions T define a 
connected graph on X. If |T| = 1, then |X| = 2 and there is nothing to prove. We 
may now proceed by induction on |T| and conclude that $\mathcal{G}(T)$ is a tree. Now we 
are free to choose the transposition t ∈ T so that as an edge of the graph $\mathcal{G}(T)$, 
t connects an “end point” a of valence one, and a vertex b. Thus $\mathcal{G}(T - \{t\})$ is a tree 
on the vertices X − {a}, and so by induction the stabilizer in G(T) of the vertex a 
has order (n − 1)!, and so as G(T) is clearly transitive, it has order n!. Since it is 
faithful, (for its elements fix every single letter of every other orbit), we have here 
the full symmetric group. 


Theorem 4.2.5 (Feit, Lyndon and Scott) Let T be a minimal set of transpositions 
generating Sym(n). Then their product in any order is an n-cycle. 

Proof We have seen from above that the minimality of T implies that the graph 
$\mathcal{G}(T)$ is a tree. 

Let $a = t_1 t_2 \cdots t_{n-1}$ be the product of these transpositions in any particular order. 
Consider $t_1$ and remove the edge $t_1$ from the graph $\mathcal{G}(T)$ to obtain a new graph $\mathcal{G}'$ 
which is not connected, but has two connected component trees $\mathcal{G}_1$ and $\mathcal{G}_2$, connected 
in $\mathcal{G}(T)$ by the “bridging” edge $t_1$. Now by an obvious induction any order of multiplying 
the transpositions of 

$$T_i := \{s \in T \mid s \text{ is an edge in } \mathcal{G}_i\},$$

i = 1, 2, yields an $n_i$-cycle, where $n_i$ is the number of vertices of $\mathcal{G}_i$. Thus 

$$a = t_1 (x_1 \ldots x_k)(y_1 \ldots y_l),$$

where the second cycle b is the product of the transpositions of $T_1$ as they are 
encountered in the factorization $a = t_1 \cdots t_{n-1}$, and the third cycle c above is the product 
of the transpositions of $T_2$ in the order in which they are encountered in the product 
given for a. (Recall that every transposition in $T_1$ commutes with every transposition 
in $T_2$.) Now these two cycles involve two separate orbits bridged by the transposition 
$t_1$. That the product $t_1 bc$ is an n-cycle follows directly from Lemma 4.2.2 Part (ii) 
above. The proof is complete. 


This strange theorem will bear fruit in the proof of the Brauer-Ree Theorem, 
which takes a bow in the chapter on generation of groups (Chap. 6). 
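
Theorem 4.2.5 is easy to test experimentally on a small tree. The sketch below is only an illustration under our own assumptions: the tree on the vertices 0, ..., 4 with edges {0,1}, {1,2}, {1,3}, {3,4} is an arbitrary choice, and the helper names are hypothetical. It multiplies the four corresponding transpositions in all 24 possible orders and checks that each product is a 5-cycle.

```python
from itertools import permutations

def compose(p, q):
    """Right-operator product pq: apply p first, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def transposition(n, a, b):
    """The transposition (a b) on the letters 0, ..., n - 1."""
    t = list(range(n))
    t[a], t[b] = b, a
    return tuple(t)

def is_n_cycle(p):
    """True when the orbit of the letter 0 under p is the whole set."""
    steps, x = 1, p[0]
    while x != 0:
        x = p[x]
        steps += 1
    return steps == len(p)

n = 5
tree_edges = [(0, 1), (1, 2), (1, 3), (3, 4)]   # a tree on 5 vertices

def product(edges):
    """Product of the transpositions attached to `edges`, in the given order."""
    result = tuple(range(n))
    for a, b in edges:
        result = compose(result, transposition(n, a, b))
    return result

all_n_cycles = all(is_n_cycle(product(order)) for order in permutations(tree_edges))
print(all_n_cycles)   # True
```

Different orders generally give different 5-cycles; the theorem asserts only that each product is a cycle of full length.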


4.2.4 Action on Right Cosets 


Let H be any subgroup of the group G, and let G/H be the collection {Hg | g ∈ G} of 
all right cosets of H in G. Note that we have extended our previous notation. G/H 
used to mean a group of right cosets of a normal subgroup H of G. It is still the same 
set (of right cosets of H in G) even when H is not normal in G. But it no longer has 
the structure of a group. Nonetheless, as we shall see, it still admits a right G-action. 

For any element g of G and coset Hx in G/H, Hxg is also a right coset of H in 
G. Moreover, Hxg = Hyg implies Hx = Hy, while Hx = $(Hxg^{-1})g$. Thus right 
multiplication of all the right cosets of H in G by the element g induces a bijection $\pi_g$ : 
G/H → G/H. Moreover for elements g, h ∈ G, the identity Hx(gh) = ((Hx)g)h, 
forced by the associative law, implies $\pi_g \pi_h = \pi_{gh}$ as right operators on G/H. Thus 
we have a group action 

$$\pi_H : G \to \mathrm{Sym}(G/H)$$

which takes g to the permutation $\pi_g$ induced by right multiplication of right cosets 
by g. This action is always transitive, for coset Hx is mapped to coset Hy by $\pi_{x^{-1}y}$. 


4.2.5 Equivalent Actions 


Suppose a group G acts on sets X and Y, that is, there are group homomorphisms 
$f_X$ : G → Sym(X) and $f_Y$ : G → Sym(Y). The two actions are said to be 
equivalent actions if and only if there is a bijection e : X → Y such that for each 
letter x ∈ X, and element g ∈ G, 

$$e(x^{f_X(g)}) = (e(x))^{f_Y(g)}.$$


That is, the action is the same except that we have used the bijection e to “change 
the names of the elements of X to those of Y”. 


4.2.6 The Fundamental Theorem of Transitive Actions 


First we make an observation: 


Lemma 4.2.6 Let f : G → Sym(X) be a transitive action of the group G on the set 
X. For each letter x ∈ X, let $G_x$ be the subgroup of all elements of G which leave 
the letter x fixed, i.e. $G_x := \{g \in G \mid x^g = x\}$. 


(i) If x, y ∈ X, then $G_x$ and $G_y$ are conjugate subgroups of G. Precisely, 
$G_y = g^{-1} G_x g$ whenever $x^g = y$. 

(ii) The set $\{g \in G \mid x^g = y\}$ of all elements of G taking x to y is a right coset of 
$G_x$. 


Proof (i) If $x^g = y$, then $y^{g^{-1} G_x g} = x^{g g^{-1} G_x g} = x^{G_x g} = x^g = y$, so $g^{-1} G_x g \subseteq G_y$. 
But since $x = y^{g^{-1}}$, we have $g G_y g^{-1} \subseteq G_x$, by the same token. Thus the first 
containment is reversible and $g^{-1} G_x g = G_y$. 

(ii) If $x^g = y$ then clearly $x^h = y$ for all h ∈ $G_x g$. Conversely, if $x^g = x^h = y$, 
then $g h^{-1} \in G_x$, so $G_x g = G_x h$. 


Theorem 4.2.7 (The Fundamental Theorem of Transitive Group Actions) Suppose 
f : G → Sym(X) is a transitive group action. Then for any letter x in X, f is 
equivalent to the action 

$$\pi_{G_x} : G \to \mathrm{Sym}(G/G_x),$$

of G on the right cosets of $G_x$ by right multiplication. 


Proof In order to show the equivalence, we must construct the bijection e : X → 
$G/G_x$ compatible with both actions. For each letter y ∈ X set $e(y) := \{g \in G \mid x^g = y\}$. 
Lemma 4.2.6 informs us that the latter is a right coset $G_x h$ of $G_x$. Now for any 
element g in G, $e(y^g)$ is the set of all elements of G which take x to $y^g$. This clearly 
contains $G_x h g$; yet by Lemma 4.2.6, it must itself be a right coset of $G_x$. Thus $e(y^g)$ 
is the right multiple $e(y)g$, establishing the compatibility of the actions. 

Corollary 4.2.8 


(i) If G acts on the set X, then every G-orbit has length dividing the order of G. 
(ii) If G acts transitively on X, then for any subgroup H of G which also acts 
transitively on X, one has $G = G_x H$, for any x ∈ X. 


Proof (i) The action of G on any G-orbit O is transitive. Applying the fundamental 
Theorem 4.2.7, we see that the action on O is equivalent to the action of G on $G/G_x$, 
the set of right cosets of $G_x$, where x is any fixed element of O. Thus $|O| = |G/G_x|$ 
for any letter x in O. 

(ii) To say that the subgroup H is transitive on X says that right multiplication of one 
such coset by the elements of H yields all such cosets. Since these cosets partition 
all the elements of G, one obtains $G_x H = G$. 
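
The content of Theorem 4.2.7 and Corollary 4.2.8(i) can be watched in a small example. In the sketch below, which is an illustration of our own and not part of the original notes, G = Sym(4) acts on the six 2-subsets of {0, 1, 2, 3}. We take x = {0, 1}, compute the orbit and the stabilizer $G_x$, and check that each fiber e(y) = {g ∈ G | $x^g$ = y} is a right coset of $G_x$.

```python
from itertools import permutations

def compose(p, q):
    """Right-operator product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def act(subset, g):
    """Image of a 2-subset under the permutation g."""
    return frozenset(g[i] for i in subset)

G = list(permutations(range(4)))                  # Sym(4), order 24
x = frozenset({0, 1})                             # the chosen letter x

orbit = {act(x, g) for g in G}                    # x^G: all six 2-subsets
stabilizer = [g for g in G if act(x, g) == x]     # G_x

# e(y) = {g in G : x^g = y}; each fiber should equal G_x h for every h in it.
fibers = {y: {g for g in G if act(x, g) == y} for y in orbit}
cosets_ok = all(
    fibers[y] == {compose(s, h) for s in stabilizer}
    for y in orbit
    for h in fibers[y]
)

print(len(orbit), len(stabilizer), len(G))   # 6 4 24
print(cosets_ok)                             # True
```

Here the orbit length 6 is the index of the stabilizer of order 4 in the group of order 24.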


4.2.7 Normal Subgroups of Transitive Groups 


If G acts transitively on a set X, we say that G acts regularly on X if and only if for 
some x ∈ X, $G_x = 1$. 


Lemma 4.2.9 Suppose G acts transitively on X. 


(i) If N is a normal subgroup of G, then all N-orbits on X have the same length. 
(ii) In particular, if N is a non-identity normal subgroup of G lying in a subgroup 
$G_x$, for some x in X, then N acts trivially on X, and the action is not faithful. 
(iii) The faithful transitive action of an abelian group is always a regular action. 
(iv) Suppose N is a normal subgroup of G which acts regularly on X. Then for 
any x ∈ X, $G = G_x N$, with $G_x \cap N = 1$, and the action of $G_x$ on X − {x} 
(by restriction) is equivalent to that action of $G_x$ on N − {1}, the set of all 
non-identity elements of N, induced by conjugation by the elements of $G_x$. 


Proof (i) For any g ∈ G and x ∈ X, the mapping 

$$x^N \to (x^N)^g = x^{Ng} = x^{gN} = (x^g)^N$$

is a bijection of N-orbits. Since G is transitive, all N-orbits can be gotten this way. 
(ii) This is immediate from (i), since N has an orbit of length 1. 
(iii) If A is an abelian subgroup acting transitively on X, then for any x ∈ X, $A_x$ acts 
trivially on X by part (ii). Since A acts faithfully, $A_x = 1$, so A has the regular action. 
(iv) That $G = G_x N$ and $G_x \cap N = 1$ follows from Corollary 4.2.8, part (ii), and 
the fact that N is regular. 
Since the normal subgroup N is regular, there is a bijection e : N → X = $x^N$ 
which sends element n ∈ N to $x^n$. For each element h ∈ $G_x$, 

$$e(h^{-1} n h) = x^{h^{-1} n h} = x^{n h} = (x^n)^h = e(n)^h.$$

So e defines an equivalence of the action of $G_x$ on N by conjugation and the action 
of $G_x$ on X obtained by restriction of the action of G. Throwing away the $G_x$-orbit 
{x} of length 1 yields the last statements of (iv). 


4.2.8 Double Cosets 


Let H and K be subgroups of a group G. Any subset of the form HgK is called an 
(H, K)-double coset. Such a set is, on the one hand, a disjoint union of right cosets 
of H, and on the other, a disjoint union of left cosets of K. So its cardinality must be 
a common multiple of |H| and |K|. The precise cardinality is given in the following: 


Lemma 4.2.10 Let H and K be subgroups of G. 


(i) $|HgK| = |H| \cdot [K : K \cap g^{-1}Hg] = |H||K| / |gKg^{-1} \cap H| = [H : H \cap gKg^{-1}]\,|K|$. 
(ii) The (H, K)-double cosets partition the elements of G. 


Proof In the action $\pi_H$ of G on G/H by right multiplication, the K-orbits are precisely 
the right cosets of H within an (H, K)-double coset. Since G/H is partitioned by 
such orbits, and the right cosets of H partition G, part (ii) follows. 

The length of a K-orbit on G/H is the index of its subgroup fixing one of its 
“letters”, say Hg. This subgroup would be $\{k \in K \mid Hgk = Hg\} = K \cap g^{-1}Hg$. 
Since each right coset of H has |H| elements of G, the second and third terms of 
the equations in Part (i) give |HgK|. The last term appears from the symmetry of H 
and K. 
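
Lemma 4.2.10 can be checked by direct enumeration in a small group. The sketch below is an illustration under our own choices: G = Sym(4) on the letters {0, 1, 2, 3}, H is the subgroup of order 2 generated by the transposition (0 1), and K is the stabilizer of the letter 3, a copy of Sym(3). The function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Right-operator product pq: apply p first, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

G = list(permutations(range(4)))        # Sym(4)
H = [(0, 1, 2, 3), (1, 0, 2, 3)]        # generated by the transposition (0 1)
K = [g for g in G if g[3] == 3]         # stabilizer of the letter 3

def double_coset(g):
    """The set HgK of all products hgk."""
    return frozenset(compose(compose(h, g), k) for h in H for k in K)

def predicted_size(g):
    """|H||K| / |gKg^-1 meet H|, the middle expression of Lemma 4.2.10(i)."""
    conjugate = {compose(compose(g, k), inverse(g)) for k in K}
    return len(H) * len(K) // len(conjugate & set(H))

double_cosets = {double_coset(g) for g in G}
print(sorted(len(c) for c in double_cosets))                         # [6, 6, 12]
print(all(len(double_coset(g)) == predicted_size(g) for g in G))     # True
```

The three double cosets have sizes 6, 6 and 12, which sum to 24, in agreement with part (ii) of the lemma.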


4.3 Applications of Transitive Group Actions 


4.3.1 Cardinalities of Conjugacy Classes of Subsets 


For any subset X of a group G, the normalizer of X in G is the set of all elements 
$N_G(X) := \{g \in G \mid g^{-1}Xg = X\}$. Now G acts transitively by conjugation on 
$X^G = \{g^{-1}Xg \mid g \in G\}$, the set of distinct conjugates of X, and the normalizer 
$N_G(X)$ is just the subgroup fixing one of the “letters” of $X^G$. Thus 


Lemma 4.3.1 


(i) Let X be a subset of G. The cardinal number $|X^G|$ of distinct conjugates of X 
in G is the index $[G : N_G(X)]$ of the normalizer of X in G. This holds when X 
is a subgroup, as well. 

(ii) The cardinality of a conjugacy class $x^G$ in G is the index of the centralizer $C_G(x)$ 
in G. 


4.3.2 Finite p-groups 


A finite p-group is a group whose order is a power of a prime number p. The trivial 
group of order 1 is always a p-group. 


Lemma 4.3.2 


(i) If a finite p-group acts on a finite set X of cardinality not divisible by the prime 
p, then it fixes a letter. 
(ii) A non-trivial finite p-group has a nontrivial center. 
(iii) If H is a proper subgroup of a finite p-group P, then H is properly contained 
in its normalizer in P, that is, $H < N_P(H)$. 


Proof (i) Let P be a finite p-group, say of order $p^n$. Then any P-orbit on X has 
length a power of p. Thus if no orbit had length 1, p would divide |X|, since X 
partitions into such orbits. But as p does not divide |X|, some P-orbit must have 
length 1, implying the conclusion. 

(ii) Let P be as in part (i), but of order $p^n > 1$. Now P acts by conjugation on the 
set P − {1} of $p^n - 1$ non-identity elements. Since p does not divide the cardinality 
of this set, part (i) implies there is a non-identity element z left fixed by this action. 
That is, $g^{-1} z g = z$, for all g ∈ P. Clearly z is a non-identity element of the center 
Z(P), our conclusion. 

(iii) Suppose H is a proper subgroup of a p-group P. By way of contradiction 
assume that $H = N_P(H)$. Then certainly, Z(P) ≤ H. By Part (ii), Z(P) ≠ 1. If 
H = Z(P), then H ⊴ P, so $P = N_P(H) = H$, a contradiction to H being a proper 
subgroup. Thus, under the homomorphism 

$$f : P \to \bar{P} := P/Z(P),$$

H maps to a non-trivial proper subgroup $\bar{H} := f(H)$ of the p-group $\bar{P}$. By induction 
on |P|, $N_{\bar{P}}(\bar{H})$ properly contains $\bar{H}$. Thus the preimage of $N_{\bar{P}}(\bar{H})$, which is 
$f^{-1}(N_{\bar{P}}(\bar{H}))$, properly contains $H = f^{-1}(\bar{H})$. But this preimage is 

$$f^{-1}(N_{\bar{P}}(\bar{H})) = \{x \in P \mid x^{-1}(H/Z(P))x = H/Z(P)\} = N_P(H) \le H,$$

and this contradicts its proper containment of H. The proof is complete. 


4.3.3 The Sylow Theorems 


Theorem 4.3.3 (Sylow’s Theorem) Let G be a group of finite order n and let k be 
the highest power of the prime p such that p* divides the group order n. Then the 
following three statements hold: 


(i) (Existence) G contains a subgroup P of order p*. 
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(ii) (Covering and Conjugacy) Every p-subgroup R of G is contained in some 
conjugate of P. In particular, any subgroup of order p* is conjugate in G to P. 
(iii) (Arithmetic) The number of subgroups of order p* is congruent to 1 mod p. 


Proof (i) Let denote the collection of all subsets of G of cardinality p*. The number 


: n ; 
of such subsets is |X| = ( k ) Since, for any natural number r < p*, the numbers 


n —r and p* —r are both divisible by the same highest power of p, the number 
|X| is not divisible by the prime p. Now G acts on & by right multiplication— 
that is, if X € X then Xg € &, for any group element g. Thus » partitions into 
G-orbits under this action, and not all of these orbits can have length divisible by 
Pp, since p doesn’t divide |X|. Let X; be such a G-orbit of length not divisible by 
p. By the Fundamental Theorem of Transitive Group Actions (Theorem 4.2.7), the 
number of sets in 1 is the index of the subgroup Gy fixing the “letter” X in %}. 
Since n = |G| = [G: Gx]|Gx| = |21||Gx|, we see that pk divides |Gx|. But 
by definition, XGy = X, so X is a union of left cosets of Gx; so |Gx| divides 
|X| = p*. It follows that Gy has order p* exactly. So Gy is the desired subgroup P 
of statement (i). 

(ii) Let R be any p-subgroup of G. Then R also acts on $\mathcal{X}_1$ by right multiplication.
Since p does not divide $|\mathcal{X}_1|$, Lemma 4.3.2, part (i), shows that R must fix a letter Y
in $\mathcal{X}_1$. Thus $R \le G_Y$. But as G is transitive on $\mathcal{X}_1$, Lemma 4.2.6 shows that $G_Y$ is
conjugate to $P = G_X$. Thus R lies in a conjugate of P as required.

(iii) Now let S denote the collection of all subgroups of G having order $p^k$ exactly.
By parts (i) and (ii), already proved, S is non-empty, and G acts transitively on S
by conjugation. Thus by Lemma 4.3.1 part (i), $|S| = [G : N]$, where $N := N_G(P)$,
the normalizer in G of a subgroup P in S. Now P itself acts on S by conjugation,
fixing itself, and acting on $S - \{P\}$ in orbits of p-power length. Suppose {R} were
such a P-orbit in $S - \{P\}$ of length one. Then P normalizes R so PR is a subgroup
of G. Then $|PR| = |P| \cdot [R : P \cap R]$ (Theorem 3.4.5, part (ii)), a product of $p^k$
and $[R : P \cap R]$, another power of p. Since |PR| divides n and $p^k$ is the largest
p-power dividing n, one must conclude that $[R : P \cap R] = 1$, which forces P = R,
contrary to the choice of R in $S - \{P\}$. Thus P acts on $S - \{P\}$ in orbits of lengths
divisible by p. This yields $|S| \equiv 1 \bmod p$. Thus all parts of Sylow’s theorem have
been proved.
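
As a numerical sanity check on two steps of this proof, the following sketch (not part of the original argument) tests that the binomial coefficient of step (i) is prime to p for a few group orders, and counts Sylow subgroups of prime order by brute force in Sym(4) and Sym(5). The helper names `p_part`, `perm_order` and `count_subgroups_of_prime_order` are hypothetical, and the counting shortcut assumes the Sylow subgroup has order exactly p, so that it is cyclic with p - 1 generators.

```python
from itertools import permutations
from math import comb


def p_part(n, p):
    """Return the largest power of p dividing n."""
    q = 1
    while n % (q * p) == 0:
        q *= p
    return q


def perm_order(perm):
    """Order of a permutation given as a tuple with i -> perm[i]."""
    identity = tuple(range(len(perm)))
    current, order = perm, 1
    while current != identity:
        current = tuple(perm[i] for i in current)
        order += 1
    return order


def count_subgroups_of_prime_order(degree, p):
    """Count subgroups of order p in Sym(degree).

    Each such subgroup is cyclic and contains exactly p - 1 elements
    of order p, and distinct subgroups share only the identity.
    """
    elements = sum(1 for g in permutations(range(degree)) if perm_order(g) == p)
    return elements // (p - 1)


# Step (i): the number of subsets of size p^k is not divisible by p.
for n, p in [(24, 2), (24, 3), (120, 5), (168, 7)]:
    assert comb(n, p_part(n, p)) % p != 0

# Step (iii): the number of Sylow p-subgroups is congruent to 1 mod p.
sylow3_in_sym4 = count_subgroups_of_prime_order(4, 3)   # |Sym(4)| = 24 = 8 * 3
sylow5_in_sym5 = count_subgroups_of_prime_order(5, 5)   # |Sym(5)| = 120 = 24 * 5
assert sylow3_in_sym4 % 3 == 1
assert sylow5_in_sym5 % 5 == 1
```

The brute-force count is only feasible for very small degrees; it is meant to illustrate the congruence, not to compute Sylow subgroups in general.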


The subgroups of G of maximal p-power order are called the p-Sylow subgroups
of G, and the single conjugacy class which they form is denoted $\mathrm{Syl}_p(G)$.


Corollary 4.3.4 (The Frattini Argument) Suppose N is a normal subgroup of the
finite group G, and select $P \in \mathrm{Syl}_p(N)$. Then $G = N_G(P)N$.


Proof We know that conjugation by elements of G induces automorphisms of
the normal subgroup N. Since $\mathrm{Syl}_p(N)$ is the full collection of subgroups of
N of their order, G acts on $\mathrm{Syl}_p(N)$ by conjugation. But by Sylow’s Theorem
(Theorem 4.3.3) the subgroup N is already transitive on $\mathrm{Syl}_p(N)$. The result now
follows from Lemma 4.2.8, part 2.


Corollary 4.3.5


(i) Any normal p-Sylow subgroup of a finite group is in fact characteristic in it.

(ii) Any finite abelian group is the direct product of its p-Sylow subgroups, that is,
$A \simeq S_1 \times S_2 \times \cdots \times S_n$, where $S_i$ is the $p_i$-Sylow subgroup of A, and $\{p_1, \ldots, p_n\}$
lists all the distinct prime divisors of |A| exactly once each. Moreover, we have:

$$\mathrm{Aut}(A) \simeq \mathrm{Aut}(S_1) \times \mathrm{Aut}(S_2) \times \cdots \times \mathrm{Aut}(S_n).$$

(iii) The Euler phi-function is multiplicative, that is,

$$\phi(n) = \phi(a)\phi(b)$$

whenever n = ab and gcd(a, b) = 1.


Proof (i) Any normal p-Sylow subgroup is the unique subgroup of its order, and
hence is characteristic.

(ii) Since, in a finite abelian group, each p-Sylow subgroup is normal, and has
order prime to the direct product of the remaining r-Sylow subgroups ($r \ne p$), A
is the internal direct product of its Sylow subgroups (see Lemma 3.5.1, part (iii)). Since by part (i) above,
each $S_i$ is characteristic in A, each automorphism $\sigma$ of A induces an automorphism
$\sigma_i$ of $S_i$. But conversely, if we apply all possible automorphisms $\sigma_i$ to the direct
factor $S_i$ and the identity automorphism to all other p-Sylow subgroups $S_j$, $j \ne i$,
we obtain a subgroup $B_i$ of Aut(A). Now the internal direct product characterization
of Lemma 3.5.1 applies to the $B_i$ to yield the conclusion.

(iii) Applying part (ii) of this Corollary when A is the cyclic group of order n, the
group $Z_n$, and recalling that $|\mathrm{Aut}(Z_n)| = \phi(n)$, one obtains

$$\phi(n) = \prod_{i=1}^{m} \phi(p_i^{a_i})$$

when n has prime factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_m^{a_m}$, upon equating group orders.
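
The multiplicativity in part (iii) is easy to test numerically. The sketch below is an illustration only: it computes the phi-function directly from its definition (the helper name `euler_phi` is hypothetical) and checks the product formula over a small range.

```python
from math import gcd


def euler_phi(n):
    """Number of residues u in 1..n that are coprime to n."""
    return sum(1 for u in range(1, n + 1) if gcd(u, n) == 1)


# phi(ab) = phi(a) * phi(b) whenever gcd(a, b) = 1.
for a in range(1, 40):
    for b in range(1, 40):
        if gcd(a, b) == 1:
            assert euler_phi(a * b) == euler_phi(a) * euler_phi(b)

# The prime-power factorization 360 = 2^3 * 3^2 * 5 gives the same value.
phi_360 = euler_phi(360)
product_over_prime_powers = euler_phi(8) * euler_phi(9) * euler_phi(5)
assert phi_360 == product_over_prime_powers
```

The coprimality hypothesis matters: for instance phi(4) = 2 while phi(2) * phi(2) = 1.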


4.3.4 Fusion and Transfer 


Lemma 4.3.6 (The Tail-Wags-the-Dog Lemma) Suppose $\Gamma = (V, E)$ is a bipartite
graph with vertices in two parts $V_1$ and $V_2$, and each edge of E involving one vertex
from $V_1$ and one vertex from $V_2$. Suppose G is a group of automorphisms of the graph
$\Gamma$ acting on the vertex set V with $V_1$ and $V_2$ as its two orbits. Then the following
conditions are equivalent.


(i) G acts transitively on the edges of $\Gamma$.
(ii) For each vertex $v_1 \in V_1$, the subgroup $G_1$ of G fixing $v_1$ acts transitively on
the edges on $v_1$.
(iii) For each vertex $v_2$ in $V_2$, the subgroup $G_2$ of G fixing $v_2$ acts transitively on
the edges of $\Gamma$ on vertex $v_2$.


Proof By the symmetry of the $V_i$, it suffices to prove the equivalence of (i) and
(ii). If G transitively permutes the edges (as in (i)), any edge on $v_1$ must be moved
to any other edge on $v_1$ by some element g. But since $v_1$ is the only vertex of the
G-orbit $V_1$ incident with these edges, such an element g fixes $v_1$. So (ii) holds.
Conversely, suppose (ii), so that the subgroup $G_1$ transitively permutes the edges on
$v_1$. Since G is transitive on $V_1$, any edge meeting $V_1$ at a single vertex can be taken to
any other. But by hypothesis, all edges have this property and (i) follows. The proof is
complete.


Theorem 4.3.7 (The Burnside Fusion Theorem) Let G be a finite group, and let
$X_1$ and $X_2$ be two normal subsets of a p-Sylow subgroup P of G. Then $X_1$ and $X_2$
are conjugate in G if and only if they are conjugate in the normalizer in G of P. In
particular, any two elements of the center of P are conjugate if and only if they are
conjugate in $N_G(P)$.


Proof Obviously, if the $X_i$ are conjugate in $N_G(P)$, they are conjugate in G. So
assume the $X_i$ belong to $\mathcal{X} := X_1^G$, the G-conjugacy class of $X_1$. We now form a graph whose vertex set is $\mathcal{X} \cup \mathcal{S}$
where $\mathcal{S} := \mathrm{Syl}_p(G)$. An edge will be any pair $(Y, R) \in \mathcal{X} \times \mathcal{S}$ for which the subset Y
is a normal subset of the p-Sylow group R. Then G acts by conjugation on the vertex
set V with just two orbits, $\mathcal{X}$ and $\mathcal{S}$ (the first by construction, the second by Sylow’s
Theorem). Moreover the conjugation action preserves the normal-subset relationship
on $\mathcal{X} \times \mathcal{S}$, and so G acts as a group of automorphisms of our graph, which is clearly
bipartite. Now by Sylow’s theorem, the subgroup $G_1 := N_G(X_1)$ acts transitively
by conjugation on $\mathrm{Syl}_p(G_1)$. But since $X_1$ is normal in some p-Sylow subgroup
(namely P),

$$\mathrm{Syl}_p(G_1) \subseteq \mathrm{Syl}_p(G).$$

Thus $\mathrm{Syl}_p(G_1)$ are the $\mathcal{S}$-vertices of all the edges of our graph which lie on vertex $X_1$.
We now have the hypothesis (ii) of the Tail-Wags-the-Dog Lemma. So the assertion
(iii) of that lemma must hold. In this context, it asserts that the stabilizer $G_2$ of a
vertex of $\mathcal{S}$, for example $N_G(P)$, is transitive on the members of $\mathcal{X}$ which are
normal in P. But that is exactly the desired conclusion.


Theorem 4.3.8 (The Thompson Transfer Theorem) Suppose N is a subgroup of
index 2 in a 2-Sylow subgroup S of a finite group G. Suppose t is an involution in
S − N, which is not conjugate to any element of N. Then G contains a normal
subgroup M of index 2.


Proof Let G act on G/N, the set of right cosets of N in G, by right multiplication.
In this action, the involution t cannot fix a letter, for if Nxt = Nx, then $xtx^{-1} \in N$,
contrary to assumption. Thus t acts as [G : S] 2-cycles on G/N. Since [G : S] is odd, this makes t
the product of an odd number of transpositions in Sym(G/N), so we see that the image
of G in the action $f : G \to \mathrm{Sym}(G/N)$ intersects the alternating group Alt(G/N)
at a subgroup of index 2. Its preimage M is the desired normal subgroup of G.


4.3.5 Calculations of Group Orders


The Fundamental Theorem of Transitive Group Actions is quite useful in calculating 
the orders of the automorphism groups of various objects. We peruse a few examples. 

(Petersen’s Graph) The reader is invited to verify that the two graphs which appear 
in Fig. 4.1 are indeed isomorphic (vertices with corresponding numerical labels are 
matched under the isomorphism). Presenting them in these two ways reveals the 
existence of certain automorphisms. 

In the left hand figure, vertex 1 is adjacent to vertices 2, 3, and 4. The remaining
vertices form a hexagon, with three antipodal pairs, (5, 8), (6, 9) and (7, 10), which
are connected to 2, 4 and 3, respectively. Any symmetry of the hexagon induces a
permutation of these antipodal pairs, and hence induces a corresponding permutation
of {2, 3, 4}. Thus, if we rotate the hexagon clockwise 60 degrees, we obtain an
automorphism of the graph which permutes the vertices as:

y = (1)(2 3 4)(5 10 9 8 7 6).

But if we reflect the hexagon about its vertical axis we obtain

t = (1)(4)(2 3)(7 5)(8 10)(6)(9).

Now any automorphism of the graph which fixes vertex 1 must induce some symmetry
of the hexagon on the six vertices not adjacent to vertex 1. The automorphism
inducing the identity preserves the three antipodal pairs, and so fixes vertices 2, 3
and 4, as well as vertex 1, i.e. it is the identity element. Thus if G is the full
automorphism group of the Petersen graph, $G_1$ is the dihedral group $\langle y, t \rangle$ of order 12,
acting in orbits {1}, {2, 3, 4}, {5, 6, 7, 8, 9, 10}.


Fig. 4.1 Two views of Petersen’s graph. The positions labeled 1-10 are the vertices, and arcs
connecting two vertices represent the edges. (Note that since the graph is non-planar, it is drawn
with apparent intersections of arcs, which, if unlabeled, do not represent vertices.)

On the other hand, the right hand figure in Fig. 4.1 reveals an obvious automorphism
of order five, namely

u = (1 2 5 6 4)(3 8 10 7 9).

That G is transitive comes from overlaying the three orbits of $G_1$, and the two orbits
of $\langle u \rangle$ of length five. Together, one can travel from any vertex to any other, by hopping
a ride on each orbit, and changing orbits where they overlap. This forces $[G : G_1] = 10$,
the number of vertices. Since we have already determined $|G_1| = 12$, we see
that the full automorphism group of this graph has order $|G| = [G : G_1]\,|G_1| = 10 \cdot 12 = 120$.
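
The count can be confirmed by machine. The following sketch builds the graph from the adjacencies described above for the left hand view (vertex 1 joined to 2, 3, 4; the hexagon 5, ..., 10; the antipodal pairs joined to 2, 4, 3), checks that y, t and u preserve adjacency, and counts automorphisms by a simple backtracking search. The function names are hypothetical and the search is a naive illustration, not an efficient algorithm.

```python
# Edges as described in the text for the left hand view of Fig. 4.1.
EDGES = {frozenset(e) for e in [
    (1, 2), (1, 3), (1, 4),                              # vertex 1 and its neighbours
    (5, 6), (6, 7), (7, 8), (8, 9), (9, 10), (10, 5),    # the hexagon
    (2, 5), (2, 8), (4, 6), (4, 9), (3, 7), (3, 10),     # antipodal pairs
]}
VERTICES = list(range(1, 11))


def from_cycles(cycles):
    """Build a vertex map from a list of cycles; unlisted vertices are fixed."""
    image = {v: v for v in VERTICES}
    for cycle in cycles:
        for i, v in enumerate(cycle):
            image[v] = cycle[(i + 1) % len(cycle)]
    return image


def is_automorphism(image):
    """True if the bijection `image` maps the edge set onto itself."""
    moved = {frozenset(image[v] for v in edge) for edge in EDGES}
    return moved == EDGES


def count_automorphisms(fixed=None):
    """Count adjacency-preserving bijections extending the partial map `fixed`."""
    def extend(partial):
        if len(partial) == len(VERTICES):
            return 1
        v = next(x for x in VERTICES if x not in partial)
        used = set(partial.values())
        total = 0
        for w in VERTICES:
            if w in used:
                continue
            # v ~ a must hold exactly when w ~ (image of a) holds.
            compatible = all(
                (frozenset((v, a)) in EDGES) == (frozenset((w, b)) in EDGES)
                for a, b in partial.items()
            )
            if compatible:
                partial[v] = w
                total += extend(partial)
                del partial[v]
        return total
    return extend(dict(fixed or {}))


y = from_cycles([(2, 3, 4), (5, 10, 9, 8, 7, 6)])
t = from_cycles([(2, 3), (7, 5), (8, 10)])
u = from_cycles([(1, 2, 5, 6, 4), (3, 8, 10, 7, 9)])

stabilizer_order = count_automorphisms({1: 1})   # order of the stabilizer of vertex 1
group_order = count_automorphisms()              # order of the full automorphism group
```

Here `stabilizer_order` and `group_order` play the roles of $|G_1|$ and $|G|$ in the index computation above.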

(The Projective Plane of Order 2.) Now consider a system $\mathcal{P}$ of seven points, which
we shall name by the integers {0, 1, 2, 3, 4, 5, 6}. We introduce a system $\mathcal{L}$ of seven
triplets of points, which we call lines:


$L_1 := [1, 2, 4]$
$L_2 := [2, 3, 5]$
$L_3 := [3, 4, 6]$
$L_4 := [4, 5, 0]$
$L_5 := [5, 6, 1]$
$L_6 := [6, 0, 2]$
$L_0 := [0, 1, 3]$


Notice that the first line $L_1$ is the set of quadratic residues mod 7, and that the
remaining lines are translates, mod 7, of $L_1$, that is, each $L_i$ has the form $L_1 + (i - 1)$,
with entries in the integers mod 7.

This system is also depicted by the six straight lines and the unique circle in
Fig. 4.2. (The reader need only check that the lines in each system are labeled identically.)
Automorphisms of the system $(\mathcal{P}, \mathcal{L})$ are those permutations of the seven
points which take lines to lines, i.e. preserve the system $\mathcal{L}$. We shall determine the
order of the group G of all automorphisms. First, from the way lines were defined
above, as translates mod 7 of $L_1$, we have an automorphism


u = (0 1 2 3 4 5 6) : $(L_1\ L_2\ L_3\ L_4\ L_5\ L_6\ L_0)$.


Fig. 4.2 The configuration of the projective plane of order 2

Thus G is certainly transitive on points. Thus the subgroup $G_1$ fixing point 1 has
index 7. There are three lines on point 1, namely, $L_0$, $L_1$ and $L_5$. There is an obvious
reflection of the configuration of Fig. 4.2 about the axis formed by line $L_5 = [1, 6, 5]$,
giving the automorphism


s = (1)(6)(5)(0 4)(2 3) : $(L_0\ L_1)(L_3\ L_6)(L_2)(L_4)(L_5)$.


But not quite so obvious is the fact that t = (0)(1)(3)(2 6)(4 5) induces the permutation
$(L_0)(L_1\ L_5)(L_2\ L_3)(L_4)(L_6)$ and so is an automorphism fixing 1 and $L_0$
and transposing $L_1$ and $L_5$. Thus $G_1$ acts as the full symmetric group on the set
$\{L_0, L_1, L_5\}$ of three lines on point 1. Consequently the kernel of this action is a
normal subgroup K of $G_1$, at index 6. Now K stabilizes all three lines $\{L_0, L_1, L_5\}$,
and contains


r = (0 3)(1)(2 4)(5)(6) : $(L_0)(L_1)(L_2\ L_4)(L_3\ L_6)(L_5)$.


So K acts as a transposition on the remaining points of $L_0$ beyond point 1. Thus K
in turn has a subgroup $K_1$ of index 2, fixing $L_0$ pointwise, and stabilizing $L_1$ and $L_5$. $K_1$, in
turn, contains


v = (0)(1)(2 4)(3)(5 6) : $(L_0)(L_1)(L_5)(L_2\ L_3)(L_4\ L_6)$.


It in turn contains a subgroup of index 2 whose elements now fix 0, 1, 3, 5 and 6.
One can see that any such element stabilizes every line (for every pair of fixed points
determines a fixed line), and hence stabilizes all intersections of such lines, and hence
fixes all points. Thus $K_1$ has order only 2. We now have


$$|G| = [G : G_1]\,[G_1 : K]\,[K : K_1]\,|K_1| = 7 \cdot 6 \cdot 2 \cdot 2 = 168.$$
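
Since there are only 7! = 5040 permutations of the seven points, the chain of subgroups used above can be checked exhaustively. The sketch below is a brute-force illustration (the variable names `G1`, `K`, `K1` mirror the text; the helper names are hypothetical): it filters the permutations preserving the line system and then cuts down to the successive stabilizers.

```python
from itertools import permutations

# The seven lines L_1, ..., L_6, L_0 as listed in the text.
LINES = {frozenset(s) for s in [
    (1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 0),
    (5, 6, 1), (6, 0, 2), (0, 1, 3),
]}


def image_of(perm, line):
    """Image of a line under the point permutation i -> perm[i]."""
    return frozenset(perm[x] for x in line)


def preserves_lines(perm):
    """True if every line is mapped to a line."""
    return all(image_of(perm, line) in LINES for line in LINES)


automorphisms = [g for g in permutations(range(7)) if preserves_lines(g)]

# G_1: automorphisms fixing the point 1.
G1 = [g for g in automorphisms if g[1] == 1]

# K: elements of G_1 stabilizing each of the three lines on point 1.
lines_on_1 = [line for line in LINES if 1 in line]
K = [g for g in G1 if all(image_of(g, line) == line for line in lines_on_1)]

# K_1: elements of K fixing the line L_0 = [0, 1, 3] pointwise.
K1 = [g for g in K if g[0] == 0 and g[3] == 3]

indices = (len(automorphisms) // len(G1), len(G1) // len(K), len(K) // len(K1), len(K1))
```

The tuple `indices` lists the four factors of the product displayed above, in the same order.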


4.4 Primitive and Multiply Transitive Actions 
4.4.1 Primitivity 


Suppose G acts on a set X. A system of imprimitivity for the action of G is a non-trivial
partition of X which is stabilized by G. Thus X is a disjoint union of two or
more proper subsets $X_i$, not all of cardinality one, such that for each $g \in G$ and each i, $X_i^g = X_j$
for some j; that is, G permutes the components of the partition among themselves.
Of course, if G is transitive in its action on X, the components $X_i$ all have the
same cardinality (evidently one dividing the cardinality of X), and are transitively
permuted among themselves.

As an example, the symmetric group Sym(n) acts on all of its elements by right
multiplication (the regular representation). But these elements were partitioned as
$\mathrm{Sym}(n) = A^+ \cup A^-$, into those elements which were a product of an even number
of transpositions ($A^+$) and those which were a product of an odd number of transpositions
($A^-$). Right multiplication permutes these sets, and they form a system of
imprimitivity when n > 2.

We have seen that when N is a non-trivial normal intransitive subgroup of a
group G acting faithfully on X, then the N-orbits form a system of imprimitivity.
(Lemma 4.2.9 considers the case of N-orbits when G acts transitively on X.)

If a transitive group action preserves no system of imprimitivity, it is said to be a 
primitive group action. (Often one renders this by saying that G acts primitively.) We 
understand this to mean that primitive groups are always transitive. The immediate 
characterization is this: 


Theorem 4.4.1 Suppose G acts transitively on X, where |X| > 1. Then G acts
primitively if and only if $G_x$ is a maximal subgroup of G for some x in X.


Proof Since G is transitive, all the subgroups $G_x$, $x \in X$, are conjugate in G (see
Lemma 4.2.6, part 1). By the fundamental theorem of group actions (Theorem 4.2.7),
we may regard this action as that of right multiplication of the right cosets of $G_x$.
Now if $G_x < H < G$, then each right coset of H is a union of right cosets of $G_x$.
In this way the right cosets of H partition $G/G_x$ to form a system of imprimitivity.
Thus in the primitive case, no such H exists, so $G_x$ is G or is maximal in G. The
former case is ruled out by |X| > 1.

Conversely, if $X = X_1 + X_2 + \cdots$ is a system of imprimitivity, then all the $X_i$
have the same cardinality and for $x \in X_1$, $G_x$ stabilizes the set $X_1$. But by transitivity
of G, if $L = \mathrm{Stab}_G(X_1)$ is the stabilizer in G of the component $X_1$, then L must act
transitively on $X_1$. Since $|X_1| \ne 1$, $G_x$ is a proper subgroup of L. Since $X_1 \ne X$, L
is a proper subgroup of G. Thus transitive but imprimitive action forces $G_x$ not to
be maximal. That is, $G_x$ maximal implies primitive action of G on $G/G_x$ and hence
X. The proof is complete.
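
A small example may make the chain $G_x < L < G$ of the proof concrete. The sketch below is an illustration under assumptions not in the text: it takes the dihedral group of order 8 acting on the four corners 0, 1, 2, 3 of a square, generated by a rotation and a reflection, and checks that the pairs of opposite corners form a system of imprimitivity. The function names are hypothetical.

```python
def compose(g, h):
    """Apply g first, then h (right action): x -> h[g[x]]."""
    return tuple(h[x] for x in g)


def generate(generators):
    """Closure of the generators under composition (a finite permutation group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def preserves_partition(group, blocks):
    """True if every group element maps each block onto a block."""
    blocks = {frozenset(b) for b in blocks}
    return all(frozenset(g[x] for x in b) in blocks for g in group for b in blocks)


rotation = (1, 2, 3, 0)      # x -> x + 1 mod 4 on the corners of a square
reflection = (0, 3, 2, 1)    # fixes corners 0 and 2, swaps 1 and 3
D4 = generate([rotation, reflection])

# Point stabilizer G_x for x = 0, and the stabilizer L of the block {0, 2}.
point_stabilizer = {g for g in D4 if g[0] == 0}
block_stabilizer = {g for g in D4 if {g[0], g[2]} == {0, 2}}

diagonals_are_blocks = preserves_partition(D4, [{0, 2}, {1, 3}])
sides_are_blocks = preserves_partition(D4, [{0, 1}, {2, 3}])
```

The three orders 2, 4 and 8 exhibit the point stabilizer strictly between the identity-sized extremes, so it is not maximal, in agreement with Theorem 4.4.1.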


We immediately have: 


Corollary 4.4.2 If G has primitive action on X, then for any normal subgroup N
of G either (i) N acts trivially on X (so either N = 1 or the action is not faithful),
or (ii) N is transitive on X.


Proof From Lemma 4.2.9, p. 99, the N-orbits either form a system of imprimitivity
or form a trivial partition of X. The former alternative is excluded by our hypothesis.
Therefore the partition into N-orbits is trivial, that is, (i) or (ii) holds.


4.4.2 The Rank of a Group Action 


Suppose G acts transitively on a set X. The rank of the group action is the number of
orbits which the group G induces on the cartesian product X × X. Here an element
g of G takes the ordered pair (x, y) of X × X to $(x^g, y^g)$. Since G is transitive, one
such orbit is always the diagonal orbital $D = \{(x, x) \mid x \in X\}$. The total number of
such “orbitals”, as the orbits on X × X are called, is the rank of the group action.

Basically, the non-diagonal orbitals mark the possible binary relations on the set
X which could be preserved by G. Such a relation is defined by asserting that letter
x has relation $R_i$ to letter y (which we write as $x R_i y$) if and only if the ordered pair
(x, y) is a member of the non-diagonal orbital $O_i$ (a G-orbit on X × X). We could
then represent this relation by a directed graph, $\Gamma_i = (X, E_i)$, where the ordered
pair (x, y) is a directed edge of $E_i$ if and only if $x R_i y$ holds. In that case G acts as
a group of automorphisms of the graph $\Gamma_i$, that is, $G \le \mathrm{Aut}(\Gamma_i)$.

If $O_i$ is an orbital, there is also an orbital $O_i^*$ consisting of all pairs (y, x) for
which (x, y) belongs to $O_i$. Clearly $O_i^*$ has the same cardinality as $O_i$.

The orbital $O_i$ is symmetric if and only if $(x, y) \in O_i$ implies $(y, x) \in O_i$, that is
$O_i = O_i^*$. In this case, the relation $R_i$ is actually a symmetric relation and if $O_i \ne D$,
then the graph $\Gamma_i$ is a simple undirected graph.³ Thus representing groups as groups
of automorphisms of graphs has a natural setting in permutation groups of rank 3 or
more.

Now fix a letter x in X. If (u, v) is in the orbital $O_i$, there exists, by transitivity,
an element g such that $u^g = x$. Thus the orbital $O_i$ possesses a representative
(x, v′), with first coordinate equal to x. Moreover, if (x, v) and (x, w) are two such
elements of $O_i$ with first coordinate x, the fact that $O_i$ is a G-orbit shows that
there is an element h in G taking (x, v) to (x, w); such an h fixes x and so lies in $G_x$. That is, v and w must belong
to the same $G_x$-orbit on X. Thus there is a one-to-one correspondence between the
orbitals of the action of G on X and $G_x$-orbits on X. The lengths of these orbits
are called the subdegrees of the permutation action of G on X. By the Fundamental
Theorem of Group Actions (Theorem 4.2.7), these $G_x$-orbits on X (or equivalently,
on $G/G_x$) correspond to $(G_x, G_x)$-double cosets of G, and the length of the orbit corresponding to $G_x g G_x$ is
$[G_x : G_x \cap g^{-1}G_x g]$.

We summarize most of this in the following; 


Theorem 4.4.3 Suppose G acts transitively on X.
(i) Then the rank of this action is the number of $G_x$-orbits on X.
(ii) The lengths of these $G_x$-orbits are called subdegrees and are the numbers


$$[G_x : G_x \cap g^{-1}G_x g]$$


as g ranges over a full system of $(G_x, G_x)$-double coset representatives.

(iii) The correspondence of an orbital $O_i$ and a $(G_x, G_x)$-double coset is that
$(G_x, G_x h) \in O_i$ if and only if the coset $G_x h$ is in the corresponding $(G_x, G_x)$-double
coset, namely $G_x h G_x$. Then


$$|O_i| = |X|\,[G_x : G_x \cap g^{-1}G_x g].$$


Clearly $O_i$ and $O_i^*$ correspond to $G_x$-orbits of the same size. In particular, if
an orbital is not symmetric, then there must be two subdegrees of the same size.

(iv) If a double coset $G_x g G_x$ contains an involution t, then the corresponding orbital
must be symmetric.


³ For graphs, the word “simple” means it has no multiple edges (of the same orientation) or loops.


Remark All but item (iv) was developed in the discussion preceding. If t is an
involution in $G_x g G_x$, then t inverts the edge $(G_x, G_x t)$ in the graph $\Gamma_i$.


Let’s look at a few simple examples. 


Example 27 (i) (Grassmannian Action) Suppose V is a finite-dimensional vector
space of dimension at least 2. As we have done before, let $P_k(V)$ be the collection
of all k-dimensional subspaces, where $k \le (\dim V)/2$. Then the group
GL(V) acts transitively on $P_k(V)$, as noted earlier. But because of the inequality
on dimensions, a given k-subspace W may intersect other k-subspaces at any
dimension from 0 to k. Moreover, the stabilizer H in GL(V) of the subspace W
is transitive on all further k-subspaces which meet W at a specific dimension.
Thus we see that GL(V) acts on $P_k(V)$ with an action which is rank k + 1.

(ii) (Symmetric Groups) If $k \le n/2$, the group G = Sym(n) transitively permutes
A(k), the k-subsets of A = {1, 2, ..., n}. Again, the stabilizer H of a fixed
k-subset U is transitive on the set of all k-subsets which meet U in a subset of
fixed cardinality. Thus G = Sym(n) has rank k + 1 in its action on all k-subsets.
For example, G = Sym(7) acts faithfully on the 35 points of A(3), as a rank
four group with subdegrees 1, 12, 18, 4.

G = Sym(6) acts as a rank 4 group on the 20 letters of A(3), with subdegrees
1, 9, 9, 1.

Similarly, G = Sym(5) acts as a rank 3 group on the 10 letters of A(2), with
subdegrees 1, 3, 6. If $O_i$ is the orbital of 30 ordered pairs corresponding to the
subdegree 3, then, as this orbital is symmetric, it contributes 15 undirected edges
on the 10 vertices, yielding a graph $\Gamma_i = (A(2), E_i)$ isomorphic to Petersen’s
graph, whose automorphism group of order 120 we met in the last section.

(iii) We have mentioned that G turns out to be a subgroup of the automorphism
group of the graph $\Gamma_i$. It need not be the full automorphism group, and whether
it is or not depends on which orbital $O_i$ one uses. For example, we have just
seen above that the group G = Sym(7) has a rank 4 action on the 35 3-subsets
chosen from the set A of 7 letters, with subdegrees 1, 12, 18, 4. Now it happens
that the graph $\Gamma_3$ corresponding to the subdegree 18 has diameter 2, and its full
automorphism group is Sym(8).

(iv) This might be a good time to introduce an example of a rank 3 permutation
group with two non-symmetric orbitals. Consider the group G = AQ of all
affine transformations,

$x \mapsto sx + y$, where $x, y \in Z_p$, s a nonzero square,

where $Z_p$ denotes the additive group of integer residue classes modulo a prime
$p \equiv 3 \bmod 4$. Here A denotes the additive group of translations $\{t_y : x \mapsto x + y \mid y \in Z_p\}$,
and Q denotes the multiplications by quadratic residue classes
modulo p. Then G acts faithfully on G/Q, with A as a normal regular abelian
subgroup. Now the group G has order p(p − 1)/2 which is odd. G is rank
three with subdegrees 1, (p − 1)/2, (p − 1)/2, and the non-diagonal orbitals
are not symmetric. By Lemma 4.2.9, the graph $\Gamma$ can be described as follows:
The vertices are the residue classes [m] mod p. Two of them are adjacent if
and only if they differ by a quadratic residue class. Since −1 is not a quadratic
residue here, this adjacency relation is not symmetric.
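
The subdegrees quoted in parts (ii) and (iv) can be reproduced by direct enumeration. The sketch below is an illustration with hypothetical function names: the first function counts, for a fixed k-subset U, the k-subsets meeting U in each possible cardinality (which the text identifies with the orbits of the stabilizer of U), and the second computes the orbitals of the affine group of part (iv) on ordered pairs of residues.

```python
from collections import Counter
from itertools import combinations


def subdegrees_on_k_subsets(n, k):
    """Sizes of the classes {W : |U & W| = i}, i = k, ..., 0, for a fixed k-subset U."""
    base = set(range(1, k + 1))
    counts = Counter(len(base & set(w)) for w in combinations(range(1, n + 1), k))
    return [counts[i] for i in range(k, -1, -1)]   # i = k is the diagonal orbital


def affine_square_orbitals(p):
    """Orbitals of G = {x -> s*x + y : s a nonzero square mod p} on pairs of residues."""
    squares = sorted({(s * s) % p for s in range(1, p)})
    unseen = {(a, b) for a in range(p) for b in range(p)}
    orbitals = []
    while unseen:
        a, b = next(iter(unseen))
        # G is a group, so the images of (a, b) under all its elements form the orbit.
        orbit = {((s * a + y) % p, (s * b + y) % p) for s in squares for y in range(p)}
        orbitals.append(orbit)
        unseen -= orbit
    return orbitals


sym7_on_triples = subdegrees_on_k_subsets(7, 3)
sym6_on_triples = subdegrees_on_k_subsets(6, 3)
sym5_on_pairs = subdegrees_on_k_subsets(5, 2)

orbitals_mod_7 = affine_square_orbitals(7)          # 7 is congruent to 3 mod 4
subdegrees_mod_7 = sorted(len(o) // 7 for o in orbitals_mod_7)
non_diagonal = [o for o in orbitals_mod_7 if len(o) > 7]
symmetric_flags = [{(b, a) for a, b in o} == o for o in non_diagonal]
```

Note that `subdegrees_on_k_subsets` lists the classes by decreasing intersection size, so for Sym(5) on pairs the values appear in the order 1, 6, 3 rather than 1, 3, 6; the class of size 3 consists of the pairs disjoint from U, which is the adjacency giving Petersen's graph.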


A rank 1 permutation group is so special that it is trivial, for the diagonal orbital 
D is the only orbital, forcing |X| = 1. 

A rank 2 group action has just two orbitals, the diagonal and one other. These are 
the doubly transitive group actions which are considered in the next subsection. In 
fact 


Corollary 4.4.4 The following are equivalent: 


(i) G acts as a rank 2 permutation group.
(ii) G transitively permutes the ordered pairs of distinct letters of X.
(iii) G acts transitively on X and for some letter $x \in X$, $G_x$ acts with two orbits:
{x} and X − {x}.


4.4.3 Multiply Transitive Group Actions 


Suppose G acts transitively on a set X. We have remarked in Sect. 3.1 that in that
case, for every positive integer $k \le |X|$, G also acts on each of the following sets:


1. $X^{(k)}$, the set of ordered k-tuples of elements from X,
2. $X^{(k)*}$, the set of ordered k-tuples of elements of X with entries pairwise distinct,
3. X(k), the set of (unordered) subsets of X of size k.


For k > 1, the first set is of little interest to us; for the group can never act
transitively here, and what orbits do exist tend to be equivalent to orbits on the other
two sets. A group G is said to act k-fold transitively on X if and only if it induces
a transitive action on the set $X^{(k)*}$ of k-tuples of distinct letters. It is said to have a
k-homogeneous action if and only if G induces a transitive action on the set of all
k-sets of letters.


Lemma 4.4.5 (i) If G acts k-transitively on X, then it acts k-homogeneously on
X. That is, k-transitivity implies k-homogeneity.

(ii) For $1 < k \le |X|$, a group which acts k-transitively on X acts (k − 1)-transitively
on X.

(iii) For $1 < k \le |X| - 1$, a group G which acts transitively on X acts k-fold transitively
on X if and only if the subgroup fixing any (k − 1)-tuple $(x_1, \ldots, x_{k-1})$ of distinct letters
transitively permutes the remaining letters, $X - \{x_1, \ldots, x_{k-1}\}$.


Proof The first two parts are immediate from the definitions. The third part follows
from Corollary 4.4.4 when k = 2. So we assume $2 < k \le |X| - 1$, and will apply
induction.

Assume X = {1, 2, 3, ...} (the notation is not intended to suggest that X is
countable; only that a countable few elements have been listed first), and suppose G
acts transitively on X and that for any ordered (k − 1)-tuple of distinct letters, say (1, 2, ..., k − 1),
the subgroup $G_{1,\ldots,k-1}$ that fixes the (k − 1)-tuple point-wise is transitive on the
remaining letters, X − {1, 2, ..., k − 1}. Clearly, the result will follow if we can
first show that G is (k − 1)-fold transitive on X. Now set U = X − {1, ..., k − 2}.
The subgroup $H := G_{1,\ldots,k-2}$ of all elements of G fixing {1, ..., k − 2} point-wise
properly contains the group $G_{1,\ldots,k-1}$ just discussed, and has this property:


(4.43) For every letter u in U, $H_u$ fixes {u} and is transitive on U − {u}.


Since $k \le |X| - 1$ causes $|U| \ge 3$, the presented property implies H acts
transitively on U. Thus we have the hypotheses of part (iii) for k − 1, and so we can
conclude by induction that G has (k − 1)-fold transitive action on X. Then the hypothesis on
$G_{1,\ldots,k-1}$ implies the k-fold transitive action.

The converse implication in part (iii), that k-fold transitive action causes $G_{1,\ldots,k-1}$
to be transitive on all remaining letters, is straightforward from the definitions.


Remark The condition that $k \le |X| - 1$ is necessary. The group Alt(5) acts transitively
on five letters, and any subgroup fixing four of these letters is certainly
transitive on the remaining letters (or more accurately “letter”, in this case). Yet the
group is not 4-transitive on the five letters: it is only 3-fold transitive.


Lemma 4.4.6 Suppose $f : G \to \mathrm{Sym}(X)$ is a 2-transitive group action. Then the
following statements hold:


(i) G is primitive in its action on X.
(ii) If $x \in X$, $G = G_x + G_x t G_x$, where t is congruent to an involution mod ker f.
(iii) Suppose N is a finite regular normal subgroup of G. Then N has no proper
characteristic subgroups and is a direct product of cyclic groups of prime order
p. (Such a group is called an elementary abelian p-group.) It follows that
$|N| = |X| = p^a$ is a prime power.


Proof (i) Suppose $X_1$ is a non-trivial component of a system of imprimitivity, so
that $1 \ne |X_1|$ and $X_1 \ne X$. For any $x \in X_1$, the subgroup $G_x$ stabilizes $X_1$. But $G_x$
acts transitively on X − {x}, forcing $X_1 = X$, a contradiction. So there is no such
system of imprimitivity, as was to be shown.

(ii) Since $n(n - 1) = |X^{(2)*}|$, where n = |X|, divides |G/ker f|, the latter number is even. So
by Sylow’s theorem applied to G/ker f, there must exist an element z of G which
acts as an involution on X. Then z must induce at least one 2-cycle, say (a, b) on X.
From transitivity, we can choose $g \in G$ such that $a^g = x$. Then $t := g^{-1}zg$ is the
involution displacing x and can be taken as the double coset representative.

(iii) By Lemma 4.2.9 part (iv), $G_x$ transitively permutes the non-identity elements
of N by conjugation. Since the conjugation action induces automorphisms of N, N
does not contain any proper characteristic subgroups. Also, all non-identity elements
are G-conjugate and so have the same finite order, which must be a prime p. This
means $\mathrm{Syl}_r(N) = \{1\}$ if $r \ne p$, so N is a p-group. By Lemma 4.3.2, N has a non-trivial
center, which is characteristic in N, and so N is abelian. Since all its elements have order p, it is elementary
abelian. Since it is regular, the statements about |X| follow.


Let us discuss a few examples. 


Example 28 (i) The groups FinSym(X) are k-fold transitive for every natural number
k not exceeding |X|. However, an easy induction proof shows that Alt(X)
is (k − 2)-transitive for each natural number k less than |X|. (Recall that the
alternating group is by definition a subgroup of the finitary symmetric group.)

(ii) The group GL(V) has a 2-transitive action on the set $P_1(V)$ of all 1-subspaces
of V.
This set is called the set of projective points of V; the set $P_2(V)$ of all
2-dimensional spaces is called the set of projective lines. Incidence between elements of
$P_1(V)$ and elements of $P_2(V)$ is the relation of containment of a 1-subspace in a
2-subspace of V. Then $PG(V) := (P_1(V), P_2(V))$ is an incidence system of
points and lines with this axiom:


Linear Space Axiom. Any two points are incident with a unique line.


If dim V = 2, GL(V) is doubly transitive on the set $P_1(V)$, the projective
line. If the ground field for the vector space V is a finite field containing q
elements, then there are q + 1 projective points being permuted. The group
of actions induced, that is GL(V)/K where K is the kernel of the action
homomorphism, is the group PGL(2, q), and it is 3-transitive on $P_1(V)$.

If dim V = 3, it is true that any two distinct 2-subspaces of V meet in a
1-subspace. Thus we have the additional property:


Dual Linear Axiom. Any two distinct lines meet at a point.


Any system of points and lines, satisfying both the Linear Space Axiom and
the Dual Linear Axiom is called a projective plane. Again, if the ground field
of V is a finite field of q elements then we see that


$$|P_1(V)| = 1 + q + q^2 = |P_2(V)|$$


and each projective point lies on q + 1 projective lines, and, dually, each projective
line is incident with exactly q + 1 projective points.

In the case q = 2 (which means that the field is Z/(2), integers mod 2), we get
the system of seven points and seven lines which we have met as the second
example in Sect. 4.3.5. We saw there that the full automorphism group of the
projective plane of order 2 had order 168.


4.4.4 Permutation Characters


The results in this section only make sense for actions on a finite set X. 

Suppose $f : G \to \mathrm{Sym}(X)$ is a group action on a finite set X. Then f(G) is a
finite group of permutations of X. Without loss of generality we assume G itself is
finite. (We do this, so that some sums taken over the elements of G are finite. But
this is no drawback, for the assertions we wish to obtain can in general be applied to
f(G) as a faithfully acting finite permutation group.)

Associated with the finite action of G is the function


$$\pi : G \to \mathbb{Z}$$

which maps each element g of G to the number $\pi(g)$ of fixed points of g, that is,
the number of letters that g leaves fixed. This function is called the permutation
character of the (finite) action f, and, of course, is completely determined by f. The
usefulness of this function is displayed in the following:


Theorem 4.4.7 (Burnside’s Lemma)


(i) The number of orbits with which G acts on X is the average number of letters
fixed by the group elements:

$$(1/|G|) \sum_{g \in G} \pi(g).$$

In particular, G is transitive if and only if the average value of $\pi$ over the group
is 1.
(ii) The number of orbits of G on $X^{(k)*}$ is

$$(1/|G|) \sum_{g \in G} \pi(g)(\pi(g) - 1) \cdots (\pi(g) - k + 1).$$

(iii) In particular, if G acts 2-transitively, then

$$(1/|G|) \sum_{g \in G} \pi(g)^2 = 2.$$

Also, if G is 3-transitive,

$$(1/|G|) \sum_{g \in G} \pi(g)^3 = 5.$$

Proof (i) Count pairs (x, g) € X x G with x7 = x. One way is to first choose x € X 
and then find g € G,. This count yields 


Divex! Ox! = Dcoppits| GxOxl =D orpitg! Gl = IGI no. of orbits. 
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On the other hand we can count these pairs by choosing the group element g first, 
then x € X in 7(g) ways. Thus 


|G|- no. of orbits = >° __7(g). 


geG 


(ii) The number of ordered k-tuples of distinct letters left fixed by element g is
$\pi(g)(\pi(g) - 1) \cdots (\pi(g) - k + 1)$ (since such a k-tuple must be selected from the
fixed point set of g). We then simply apply part (i) with X replaced by $X^{(k)}$.

(iii) If G is 2-transitive then the average value of $\pi(\pi - 1)$ is 1. But as such a
group is also transitive, the average value of $\pi$ is also 1. Thus the average value of
$\pi^2$ is

the average value of $(\pi(\pi - 1) + \pi) = 1 + 1 = 2$.

Similarly, if G is 3-transitive,

$$\frac{1}{|G|} \sum_{g \in G} \pi(g)^3 = \frac{1}{|G|} \sum_{g \in G} \bigl[\pi(g)(\pi(g) - 1)(\pi(g) - 2) + 3\pi(g)^2 - 2\pi(g)\bigr] = 1 + 3 \cdot 2 - 2 \cdot 1 = 5.$$
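
The counting in Burnside's Lemma is easy to test by machine on small examples. The following Python sketch is only an illustration and not part of the text's development: the helper names (fixed_points, orbit_count, average_power) are hypothetical, permutations are stored as tuples of images on the letters 0, ..., n-1, and the two sample groups are our own choices.

```python
from fractions import Fraction
from itertools import permutations


def fixed_points(p):
    """pi(g): the number of letters i with p[i] == i."""
    return sum(1 for i, image in enumerate(p) if i == image)


def orbit_count(group, n):
    """Count the orbits on {0, ..., n-1} directly.

    `group` must be a full group of permutations, so that the orbit of x
    is simply the set of images of x under all group elements.
    """
    seen, count = set(), 0
    for x in range(n):
        if x not in seen:
            count += 1
            seen.update(p[x] for p in group)
    return count


def average_power(group, k):
    """(1/|G|) * sum over g of pi(g)**k, as an exact fraction."""
    return Fraction(sum(fixed_points(p) ** k for p in group), len(group))


# Sym(4) on 4 letters: transitive, 2-transitive and 3-transitive.
sym4 = list(permutations(range(4)))
# An intransitive group: {e, (0 1), (2 3), (0 1)(2 3)}, with orbits {0,1} and {2,3}.
klein = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2)]

for name, group in [("Sym(4)", sym4), ("Klein", klein)]:
    averages = [average_power(group, k) for k in (1, 2, 3)]
    print(name, "orbits:", orbit_count(group, 4), "averages:", averages)
```

For Sym(4) the averages of $\pi$, $\pi^2$ and $\pi^3$ come out as 1, 2 and 5, in agreement with parts (i) and (iii); for the intransitive group the average of $\pi$ equals the number of orbits, namely 2.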


The student interested in pursuing various topics brought up in Sects. 4.3 and 4.4 
and the exercises below should consult Permutation Groups by Peter Cameron [9]. 
This excellent book is not only thoroughly up to date, but is crammed with fascinating 
side-topics. 


4.5 Exercises 


4.5.1 Exercises for Sects. 4.1—4.2 


1. Show that in any finite group G of even order, there are an odd number of
involutions. [Hint: Choose $t \in I$, where I = inv(G), the set of all involutions in
G. Then I partitions as $\{t\} + \Gamma(t) + \Delta(t)$, where $\Gamma(t) := C_G(t) \cap I - \{t\}$ and
$\Delta(t) := I - C_G(t)$. Show why the last two sets have even cardinality.]

2. (Herstein) Show that if exactly one-half of the elements of a finite group G are
involutions, then G is a generalized dihedral group of order 2n, n odd (recall
the definition in Example 26(6) on p. 98 of a “generalized dihedral group”).
[Hint: By Exercise (1) in this section, the number of involutions is an odd number
equal to one-half the group order, and by Sylow’s theorem all involutions are
conjugate, forcing $C_G(t) = \langle t \rangle$. The result follows from any Transfer Theorem
such as Thompson’s.⁴]
⁴ Almost certainly this is the elementary proof that Herstein had in mind rather than the proofs in
several longer papers on this subject which have appeared in the MAA Monthly.

3. This has several parts:

(a) Show that if p is a prime, then any group of order $p^2$ is abelian. [Hint: It
must have a non-trivial center, and the group mod its center cannot be a
non-trivial cyclic group.]

(b) Let G be a finite group of order divisible by the prime p. Show that the
number of subgroups of G of order p is congruent to one modulo p. [Hint:
Make the same sort of decomposition of the class of groups of order p as
was done for groups of order 2 in Exercise (1) in this section. You need to
know something about the number of subgroups of order p in a group of
order $p^2$.]

(c) If the prime number p divides $|G| < \infty$, show that the number of elements of
order p in G is congruent to $-1$ modulo p.

(d) (McKay’s proof of parts (b) and (c) of this exercise.) Let G be a finite group
of order divisible by the prime p. Let S be the set of p-tuples $(g_1, g_2, \ldots, g_p)$
of elements of G such that $\prod g_i = 1$.

i. Show that S is invariant under the action of $\sigma$, a cyclic shift which takes
$(g_1, g_2, \ldots, g_p)$ to $(g_p, g_1, \ldots, g_{p-1})$.

ii. As $H = \langle \sigma \rangle \cong Z_p$ acts on S, suppose there are a H-orbits of length 1
and b H-orbits of length p. Show that $|G|^{p-1} = |S| = a + bp$. As p
divides the order of G conclude that p divides a.

iii. Show that $a \neq 0$. Conclude that a is the number of elements of order
one or order p in G, and is a non-zero multiple of p.

iv. Let $P_1, \ldots, P_k$ be a full listing of the subgroups of order p in G. Show
that $|\bigcup P_i| = a \equiv 0 \bmod p$, and deduce that $k \equiv 1 \bmod p$.

(e) Suppose $p^k$ divides the order of G and P is a subgroup of order $p^{k-1}$. Show
that p divides the index $[N_G(P) : P]$. [Hint: Set $N := N_G(P)$ and suppose,
by way of contradiction, that p does not divide $[N : P]$. Then P is the
unique subgroup of N of order $p^{k-1}$ and so is characteristic in N. As $p^k$
divides |G|, we see that p divides the index $[G : N]$. Now P acts on the
left cosets of N in G by left multiplication, with N comprising a P-orbit
of length one. Show that there are at least p such orbits of length one so
that there is a left coset $bN \neq N$, with $PbN = bN$. Then $b^{-1}Pb \leq N$,
forcing $b^{-1}Pb = P$ from the uniqueness of its order. But then $b \in N$, a
contradiction.]

(f) Suppose the prime power $p^k$ divides the order of the finite group G. Define
a p-chain of length k to be a chain of subgroups of G,

$$P_1 < P_2 < \cdots < P_k,$$

where $P_j$ is a subgroup of G of order $p^j$, $j = 1, \ldots, k$. Show that the
number of p-chains of G of length k is congruent to one modulo p. [Hint:
Proceed by induction on k. The case k = 1 is part (d) of this Exercise. One
can assume the number m of p-chains of length k − 1 is congruent to 1 mod
p, and these chains of length k − 1 partition the chains of length k. If c is a
chain of length k − 1 with terminal member $P_{k-1}$, then any extension of c
to a chain of length k is obtained by adjoining a group $P_k$ of order $p^k$ in the
normalizer $N_G(P_{k-1})$. The number of such candidates is thus the number
of subgroups of order p in $N_G(P_{k-1})/P_{k-1}$. Part (e) of this Exercise shows
that p divides this group order, and so by part (d), the number of $Z_p$’s in
the factor N/P is congruent to one mod p. Show that the conclusion is now
forced.]

(g) From the previous result show that if $p^k$ divides |G|, for k > 0, then the
number of subgroups of order $p^k$ is congruent to 1 mod p.

[Remark: Note that parts (c) through (g) of this Exercise did not require either the Sylow
theorems or even the weaker Cauchy’s Theorem that asserts that G contains an
element of prime order p if p divides |G|. Indeed, these exercises essentially
reprove them. Exercise (1) in this section and part (b) of this Exercise, on the other
hand, required Cauchy’s theorem in order to have $Z_p$’s to discuss.]


4. Let G be a group of order $p^n q$, where p and q are primes and $n \geq 1$. Show
that G is not a simple group. [Hint: We may assume that there are at least two
p-Sylow subgroups. Select a pair $(P_1, P_2)$ of distinct p-Sylow subgroups so that
their intersection $M = P_1 \cap P_2$ is as large as possible. Invoke Lemma 4.3.2, part
(iii), to conclude that $N_G(M)$ contains distinct p-Sylow subgroups, and hence has
order divisible by q. Choose $Q_1 \in \mathrm{Syl}_q(N_G(M))$. Then $G = P_1 Q_1$ and so

$$\langle M^G \rangle = \langle M^{P_1} \rangle.$$

The left side of this equation is a normal subgroup of G. But the right side of
the presented equation is properly contained in $P_1$. Thus G is not simple unless
M = 1. In the latter case, the q elements of $\mathrm{Syl}_p(G)$ pairwise intersect at the
identity. Now count the number of elements of order dividing q.]


4.5.2 Exercises Involving Sects. 4.3 and 4.4 


1. Show that any group of order fifteen is cyclic. [Hint: Use Sylow’s Theorem to
show that the group is the direct product of two cyclic p-Sylow subgroups.]

2. Show that any group of order 30 has a characteristic cyclic subgroup of order 
15. 

3. Show that any group N which admits a group of automorphisms which is
2-transitive on its non-identity elements is an elementary abelian 2-group or a
cyclic group of order 3.
Extend this result by showing that if there is a group of automorphisms of N
which is 3-transitive on the non-identity elements of N, then $|N| \leq 4$.

4. Show that if G faithfully acts as a k-fold transitive group on a set X, and $G_x$
is a simple group, then (as k is at least 1) any proper normal subgroup of G is
regular on X.

Using a basic lemma of this chapter, show that if $1 \neq N$ is a normal subgroup
of G, then G is the semidirect product $N G_x$ (here $G_x \cap N = 1$ so $G_x$ is a
complement to N in G). Moreover, $G_x$ acts (k − 1)-fold transitively on the
non-identity elements of N.

Conclude that under these circumstances, G can be at most 4-fold transitive on 
X, and then only in the case that |X| = 4 and G = Sym(4). 

5. Show that the group Alt(5) acts doubly transitively on its six 5-Sylow subgroups.

6. Show that if N is a non-trivial normal subgroup of G = Alt(5), then either (i) N
has order divisible by 30 or (ii) N normalizes and hence centralizes each 5-Sylow
subgroup. [Hint: Use the fact that G acts primitively on both 5 and 6 letters.]

7. Conclude that possibility (ii) of the previous exercise is impossible, by showing
that a 5-Sylow subgroup of Alt(5) is its own centralizer.

8. Using Exercises (5—7) in this section (the three preceding exercises), show in 
just a few steps, that Alt(5) is a simple group. 

9. If G is a k-transitive group, and $G_{x_1, \ldots, x_{k-1}}$ is simple, then, provided k > 3, and
|X| > 4, G must be a simple group. So in this case, the property of simplicity is
conferred from the simplicity of a subgroup. (This often happens for primitive
groups of higher permutation rank, but the situation is not easy to generalize.)
[Hint: Use Exercises (4-8) above.]

10. Provide a proof of the following result: 


Lemma 4.5.1 Suppose G is a group acting faithfully and primitively on a set X. 
Assume further that: 


(i) A is a normal subgroup of $G_x$, and
(ii) G is generated by the set of conjugates $A^G = \{A^g \mid g \in G\}$.


Then for any non-trivial normal subgroup N of G, we have G = AN.


[Hint: Use Corollary 4.4.2 and get $G = G_x N$ and then use (i) and (ii).]


11. Organize the last few exercises into a proof that all alternating groups Alt(n) are
simple for $n \geq 5$.

12. (Witt’s Trick) Suppose K is a k-transitive group acting faithfully on the set X,
where $k \geq 2$.


(a) Show that if $H := K_a$, for some letter a in X, then $K = H + HtH$ for
some involution t in K.
(b) Suppose $\infty$ is not a letter in X, set $X' := X \cup \{\infty\}$, and set $S := \mathrm{Sym}(X')$.
Suppose now that S contains an involution s such that
i. s transposes $\infty$ and a,
ii. s normalizes H, i.e. sHs = H, and
iii. o(st) = 3, that is, $(st)^3 = 1$, while $st \neq 1$.


Then show that the subset $Y = K \cup KsK$ is a subgroup of S. [Hint: One need only
show that Y is closed under multiplication, and this is equivalent to the assertion
that $sKs \subseteq Y$. Decompose K; use the fact that sH = Hs and that sts = tst.]


13. Show that in the previous Exercise Y is triply transitive on X’. 
14. Suppose G is a simple group with an involution t such that the centralizer $C_G(t)$
is a dihedral group of order 8. Show the following:


(a) The involution t is a central involution, that is, t is in the center of at least
one 2-Sylow subgroup of G, or, equivalently, $|t^G|$ is an odd number. Thus
any 2-Sylow subgroup of G is just $D_8$.

(b) There are exactly two conjugacy classes of fours groups, say $V_1^G$ and $V_2^G$,
where $V_1$ and $V_2$ are the unique two fours subgroups of a 2-Sylow subgroup
D of G.

(c) There is just one conjugacy class of involutions in G.

(d) Each fours group $V_i$ is a TI-group, that is, a subgroup that intersects its
distinct conjugates only at the identity group.


[Hints: This is basically an exercise in the use of Sylow’s Theorem (Theorem
4.3.3), the Burnside Fusion Theorem (Theorem 4.3.7) and the Thompson Transfer
Theorem (Theorem 4.3.8).


Part (a): Take $D = C_G(t) \cong D_8$. Then its center is just $\langle t \rangle$, so $N_G(D) \leq
C_G(t) = D$.


Part (b): Let $V_i$, i = 1, 2, be the two unique fours subgroups of $D \in \mathrm{Syl}_2(G)$. Since
each $V_i$ is normal in D, the Burnside Fusion theorem tells us that they could be
conjugate in G only if they were already conjugate in $N_G(D) = D$, where it is
patently untrue.


Part (c): Let $D = V_1 V_2 = C_G(t)$ where $V_1 \cap V_2 = \langle t \rangle$. Since $D \cong D_8$, it also
contains a normal cyclic group C of order 4. If no non-central involution of D is
conjugate to the central involution t, then a contradiction to the simplicity of G is
obtained from the Thompson Transfer Theorem, using C as the subgroup of index
2 in D. If $t^G \cap D \subseteq V_1$, the same argument, using $V_1$ in place of C, again yields
a contradiction. Thus all involutions in D are conjugate to t.


Part (d): Clearly two conjugates of $V_i$ could at best meet at a central involution,
of which, by Part (c), there is only one class, $t^G$. In that case they would comprise
two fours subgroups of the same 2-Sylow subgroup (the centralizer of their
intersection), and hence could not be conjugate by Part (b).]


Remark A nice little geometry lurks in the above exercise. The points P are one class
of fours groups, say $V_1^G$, and the lines L are the other class, $V_2^G$. A line is declared
to be incident with a point if and only if the corresponding fours groups normalize
each other, i.e. they lie in a common 2-Sylow subgroup. With the information above,
the reader can easily prove that $N_G(V_i) \cong \mathrm{Sym}(4)$, which tells us that each line is
incident with just three points and each point lies on exactly three lines. Here are two
examples:


(a) |G| = 168, (P, L) is the projective plane of order 2; 7 points, 7 lines. 

(b) |G| = 360, (P, L) is the generalized quadrangle of order (2, 2); 15 points, 15 
lines. 
Are these the only possibilities? 


15. This exercise is really an optional student project. First show that if G is a 4-fold
transitive group of permutations of the finite set X, then

$$\frac{1}{|G|} \sum_{g \in G} \pi(g)^4 = 15.$$

The object of the project would be to show that in general, if G is k-fold transitive
on X then

$$\frac{1}{|G|} \sum_{g \in G} \pi(g)^k = B_k,$$

where $B_k$ is the total number of partitions of a k-set (also known as the k-th Bell
number). An analysis of the formulae involves one in expressing the number
of sequences (with possible repetitions) of k elements from a set X in terms
of ordered sequences with pair-wise distinct entries. The latter are expressed in
terms of the so-called “falling factorials” $x(x - 1)(x - 2) \cdots (x - \ell + 1)$ in the
variable x = |X|. The coefficients involve Stirling numbers and the Bell number
can be reached by known identities. But a more direct argument can be made by
sorting the general k-tuples over X into those having various sorts of repetitions;
this is where the partitions of k come in.
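
The arithmetic behind this project can be explored numerically before any proof is attempted. The sketch below is an illustration only, with hypothetical helper names (stirling2, falling, bell). It uses the standard expansion of $x^k$ in falling factorials, whose coefficients are the Stirling numbers of the second kind S(k, j). Under the assumption that G is k-fold transitive, Theorem 4.4.7(ii) says each falling factorial of $\pi$ of length j, for $1 \leq j \leq k$, has average 1, so the average of $\pi^k$ is the sum of the S(k, j), which is the Bell number.

```python
from math import prod


def stirling2(k, j):
    """Number of partitions of a k-set into exactly j non-empty blocks."""
    if k == j:
        return 1
    if j == 0 or j > k:
        return 0
    # The last element either joins one of j existing blocks or forms a new block.
    return j * stirling2(k - 1, j) + stirling2(k - 1, j - 1)


def falling(x, j):
    """Falling factorial x(x - 1)...(x - j + 1); the empty product is 1."""
    return prod(x - i for i in range(j))


def bell(k):
    """Total number of partitions of a k-set."""
    return sum(stirling2(k, j) for j in range(k + 1))


# Check x^k = sum_j S(k, j) * x(x-1)...(x-j+1) on a range of integers.
expansion_ok = all(
    x ** k == sum(stirling2(k, j) * falling(x, j) for j in range(k + 1))
    for x in range(9)
    for k in range(7)
)
print(expansion_ok, [bell(k) for k in range(1, 6)])
```

The Bell numbers for k = 1, ..., 5 are 1, 2, 5, 15, 52; the values 2, 5 and 15 are the averages appearing in Theorem 4.4.7(iii) and at the start of this exercise.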


Chapter 5 
Normal Structure of Groups 


Abstract The Jordan-Hölder Theorem for Artinian Groups is a simple application
of the poset-theoretic Jordan-Hölder Theorem expounded in Chap. 2. A discussion
of commutator identities is exploited in defining the derived series and solvability
as well as in defining the upper and lower central series and nilpotence. The
Schur-Zassenhaus Theorem for finite groups ends the chapter. In the exercises, one will
encounter the concept of normally-closed families of subgroups of a group G, which
gives rise to several well-known characteristic subgroups, such as $O_p(G)$, the torsion
subgroup, and (when G is finite) the Fitting subgroup. Some further challenges appear
in the exercises.


5.1 The Jordan-Hölder Theorem for Artinian Groups


5.1.1 Subnormality


As defined in Chap. 4, p. 99, a group is simple if and only if its only proper normal 
subgroup is the identity subgroup. (In particular, the identity group is not considered 
to be a simple group.) 

A subgroup H is said to be subnormal in G if and only if there exists a finite chain
of subgroups

$$H = N_0 \trianglelefteq N_1 \trianglelefteq \cdots \trianglelefteq N_k = G.$$

(Recall that this means each group $N_i$ is normal in its successor $N_{i+1}$, but is not
necessarily normal in the groups further up the chain.) We express the assertion that
H is subnormal in G by writing

$$H \trianglelefteq\trianglelefteq G.$$

Let SN(G) be the collection of all subnormal subgroups of G. This can be viewed
as a poset in two different ways: (1) First we can say that A is “less than or equal
to B” if and only if A is a subgroup of B, that is, SN has the inherited structure
of an induced poset of S(G), the poset of all subgroups of G with respect to the
containment relation. (2) We also have the option of saying that A “is less than B”
if and only if A is subnormal in B. The second way of defining a poset on SN
would seem to be more conservative than the former. But actually the two notions
are exactly the same, as can be concluded from the following Lemma.


Lemma 5.1.1 (Basic Lemma on Subnormality) The following hold: 


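(i) If $A \trianglelefteq\trianglelefteq B$ and $B \trianglelefteq\trianglelefteq G$, then $A \trianglelefteq\trianglelefteq G$.
(ii) If $A \trianglelefteq\trianglelefteq G$ and H is any subgroup of G, then $A \cap H \trianglelefteq\trianglelefteq H$.
(iii) If A is a subnormal subgroup of G, then A is subnormal in any subgroup that
contains it.
(iv) If A and B are subnormal subgroups of G, then so is $A \cap B$.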


Proof (i) This is immediate from the definition of subnormality: By assumption
there are two finite ascending chains of subgroups, one running from A to B, and
one running from B to G, with each member of a chain normal in its successor in that
chain. Clearly concatenating the two chains produces a chain of subgroups from A
to G, each member of the chain normal in the member immediately above it. So by
definition, A is subnormal in G.

(ii) If $A \trianglelefteq\trianglelefteq G$ one has a finite subnormal chain $A = A_0 \trianglelefteq A_1 \trianglelefteq \cdots \trianglelefteq A_j = G$.
Suppose H is any subgroup of G. It is then no trouble to see that
$A \cap H \trianglelefteq A_1 \cap H \trianglelefteq A_2 \cap H \trianglelefteq \cdots \trianglelefteq A_j \cap H = G \cap H = H$
is a subnormal series from $A \cap H$ to H.

(iii) This assertion follows from Part (ii) where H contains A.

(iv) By Part (ii), $A \cap B$ is subnormal in B. Since B is subnormal in G, the conclusion
follows by an application of Part (i).


At this point the student should be able to provide a proof that the two methods
(1) and (2) of imposing a partial order on the subnormal subgroups lead to the same
poset. We simply denote it (SN(G), ≤) (or simply SN if G is understood) since it
is an induced poset of the poset of all subgroups of G. From Lemma 5.1.1, Part (iv),
we see that

SN(G) is a lower semilattice.

Finally it follows from the second or third isomorphism theorem that if
A and B are distinct maximal normal subgroups of $G = \langle A, B \rangle = AB = BA$,
then $G/A \cong B/(A \cap B)$. Thus SN(G) is a semimodular lower semilattice and
the mapping

$$\mu : \mathrm{Cov}(SN) \to \text{simple groups}$$

(which records, for every cover $X \lhd Y$ where X is a maximal normal subgroup of Y,
the isomorphism type of the simple group Y/X) is “semimodular” in the sense of
Theorem 2.5.2, Sect. 2.5.3.
According to the general Jordan-Hölder theory (advanced in Sect. 2.5.3, Theorem
2.5.2), $\mu$ can be extended to an interval measure

$$\mu : A_{SN} \to M$$

from all algebraic intervals of SN into the additive monoid M of all multisets
of isomorphism classes of simple groups, that is, the free additive semigroup of
all formal non-negative integer linear combinations of isomorphism classes [G] of
simple groups:

$$\sum_{\text{simple } G} a_{[G]}\,[G], \quad \text{of finite support.}$$

If there is a finite unrefinable ascending chain in SN from 1 to G, then such a
chain of subnormal subgroups is called a composition series of G. The multiset of
the isomorphism classes of the simple groups $A_{i+1}/A_i$ appearing in such a
composition series $1 = A_0 \trianglelefteq A_1 \trianglelefteq \cdots \trianglelefteq A_n = G$ is called the multiset of composition factors.
Theorem 2.5.2 states that these are the same multisets for any two composition
series: the “measure” $\mu([1, G])$.

If you have been alert, you will notice that no assumption that the involved groups 
are finite has been made. The composition factors may be infinite simple groups. 

But of course, in applying the Jordan-Hölder theorem, one is interested in pinning
down a class of algebraic intervals. This is easy in the case of a finite group, for
there, all intervals of SN(G) are algebraic. How does one approach this for infinite
groups? The problem is that one can have an infinitely descending chain of subnormal
subgroups, and also one can have an infinitely ascending chain of proper normal
subgroups (just think of an abelian group with such an ascending chain).

However one can select certain subposets of SN which will provide us with many
of these algebraic intervals: Suppose from any SN(G) we extract the set SN* of all
subnormal subgroups for which any downward chain of subgroups, each normally
containing its successor, must terminate in a finite number of steps. There are three
other descriptions of this subcollection SN* of subnormal subgroups: (i) They are
the subnormal subgroups of G which belong to a finite unrefinable subnormal chain
extended downward to the identity. (ii) They are the elements M of SN for which
[1, M] is an algebraic interval of SN (see Chap. 2, Sect. 2.3.2). (iii) This is the
special class of subgroups amenable to the Jordan-Hölder Theorem: the subnormal
subgroups of G which themselves possess a finite composition series. We call the
elements of SN* the Artinian subgroups. They are closed under the operation of
taking pairwise meets in SN and so SN* is clearly a lower semilattice, all of whose
intervals are algebraic.

As a special case, we have:


Theorem 5.1.2 (The Jordan-Hölder Theorem for general groups) Let SN* be the
class of all Artinian subnormal subgroups of a group G. Then for any H in SN*, the
list (with multiplicities) of all isomorphism classes of simple groups which appear
as a composition factor in any finite unrefinable subnormal chain of H is independent of
the particular choice of that chain.


Corollary 5.1.3 Any two composition series (unrefinable subnormal series) of a
finite group give rise, through their composition factors, to the same list of isomorphism
classes of finite simple groups, with multiplicities respected.
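
A very small case makes the corollary concrete. In a finite cyclic group $Z_n$ there is exactly one subgroup of each order d dividing n, so composition series correspond to unrefinable chains of divisors from 1 to n, and each factor is cyclic of prime order. The Python sketch below is an illustration under that identification; the function names are hypothetical, and the choice n = 12 is ours.

```python
from collections import Counter


def prime_divisors(n):
    """The distinct primes dividing n, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def maximal_chains(n):
    """All unrefinable chains 1 = d_0 | d_1 | ... | d_r = n of divisors of n.

    A divisor d stands for the unique subgroup of order d of the cyclic group Z_n.
    """
    if n == 1:
        return [[1]]
    chains = []
    for p in prime_divisors(n):
        # The last step of the chain goes from the subgroup of order n/p up to Z_n.
        for chain in maximal_chains(n // p):
            chains.append(chain + [n])
    return chains


def composition_factors(chain):
    """Multiset of the prime orders of the successive factors along the chain."""
    return Counter(b // a for a, b in zip(chain, chain[1:]))


chains = maximal_chains(12)
for chain in chains:
    print(chain, dict(composition_factors(chain)))
```

For n = 12 there are three such chains, through the subgroups of orders (3, 6), (2, 6) and (2, 4), and each yields the same multiset of factors: $Z_2$ twice and $Z_3$ once, in different orders along the chain.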


5.2 Commutators 


Let G be a group. A commutator in G is an element of the form $x^{-1}y^{-1}xy$;
it is denoted [x, y]. A triple commutator [x, y, z] is an element of
shape [[x, y], z], which the student may happily work out to be

$$[x, y]^{-1} z^{-1} [x, y] z = y^{-1}x^{-1}yx\,z^{-1}\,x^{-1}y^{-1}xy\,z.$$

If we write $u^x := x^{-1}ux$, the result of conjugation of u by x, then it is an exercise
to verify the following identities, for elements x, y, and z in any group G:

$$[x, y] = [y, x]^{-1}. \qquad (5.1)$$
$$[xy, z] = [x, z]^y [y, z] = [x, z][x, z, y][y, z]. \qquad (5.2)$$
$$[x, yz] = [x, z][x, y]^z = [x, z][x, y][x, y, z]. \qquad (5.3)$$
$$1 = [x, y^{-1}, z]^y\,[y, z^{-1}, x]^z\,[z, x^{-1}, y]^x. \qquad (5.4)$$
$$[x, y, z][y, z, x][z, x, y] = [y, x][z, x][z, y]^x \times [x, y][x, z]^y[y, z]^x[x, z][z, x]^y. \qquad (5.5)$$

A commutator need not be a very typical element in a group. For example, in an 
abelian group, all commutators are equal to the identity. 
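
Identities of this kind are easy to mistype, so a mechanical spot-check is worthwhile. The Python sketch below is an illustration only: it assumes the conventions $[x, y] = x^{-1}y^{-1}xy$ and $u^x = x^{-1}ux$ used above, stores permutations as tuples of images, and tests (5.1) through (5.5) on every triple from Sym(3) and on pseudo-randomly chosen triples from Sym(5). The helper names (mul, inv, prod_all, conj, comm, identities_hold) are hypothetical, and a passing check is evidence, not a proof.

```python
import random
from itertools import permutations, product


def mul(p, q):
    """Product pq for right actions: apply p first, then q."""
    return tuple(q[i] for i in p)


def inv(p):
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def prod_all(*elems):
    """Left-to-right product of one or more permutations."""
    result = elems[0]
    for e in elems[1:]:
        result = mul(result, e)
    return result


def conj(u, x):
    """u^x = x^{-1} u x."""
    return prod_all(inv(x), u, x)


def comm(x, y, *rest):
    """[x, y] = x^{-1} y^{-1} x y, and [x, y, z] = [[x, y], z]."""
    c = prod_all(inv(x), inv(y), x, y)
    for z in rest:
        c = prod_all(inv(c), inv(z), c, z)
    return c


def identities_hold(x, y, z):
    e = tuple(range(len(x)))
    return (
        comm(x, y) == inv(comm(y, x))                                        # (5.1)
        and comm(mul(x, y), z) == mul(conj(comm(x, z), y), comm(y, z))       # (5.2)
        and comm(mul(x, y), z) == prod_all(comm(x, z), comm(x, z, y), comm(y, z))
        and comm(x, mul(y, z)) == mul(comm(x, z), conj(comm(x, y), z))       # (5.3)
        and comm(x, mul(y, z)) == prod_all(comm(x, z), comm(x, y), comm(x, y, z))
        and prod_all(conj(comm(x, inv(y), z), y),
                     conj(comm(y, inv(z), x), z),
                     conj(comm(z, inv(x), y), x)) == e                       # (5.4)
        and prod_all(comm(x, y, z), comm(y, z, x), comm(z, x, y)) == prod_all(
            comm(y, x), comm(z, x), conj(comm(z, y), x), comm(x, y),
            conj(comm(x, z), y), conj(comm(y, z), x), comm(x, z),
            conj(comm(z, x), y))                                             # (5.5)
    )


sym3 = list(permutations(range(3)))
sym5 = list(permutations(range(5)))
rng = random.Random(0)  # fixed seed so the sample is reproducible
sample = [tuple(rng.choice(sym5) for _ in range(3)) for _ in range(500)]

all_ok = all(identities_hold(x, y, z) for x, y, z in product(sym3, repeat=3)) and all(
    identities_hold(x, y, z) for x, y, z in sample
)
print(all_ok)
```

Since the identities hold in every group, any faithful model of group multiplication would serve equally well; permutations are used only because they are convenient to compute with.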

If A, B and C are subgroups of a group G, then [A, B] is the subgroup of G
generated by all commutators of the form $a^{-1}b^{-1}ab = [a, b]$ as (a, b) ranges over
$A \times B$. Similarly we write [A, B, C] := [[A, B], C].


Theorem 5.2.1 (Basic Identities on Commutators of Subgroups) Suppose A, B and 
C are subgroups of a group G. 


(i) A centralizes B if and only if [A, B] = 1. 
(ii) [A, B] = [B, A]. 
(iii) B always normalizes [A, B]. 
(iv) A normalizes B if and only if $[A, B] \leq B$. In that case, [A, B] is a normal
subgroup of B.
(v) If $f : G \to H$ is a group homomorphism, then $[f(A), f(B)] = f([A, B])$.
(vi) (The Three Subgroups Lemma of Philip Hall) If at least two of the three
subgroups [A, B, C], [B, C, A] and [C, A, B] are trivial, then so is the third.


Proof Part (i) A centralizes B if and only if every commutator of shape [a, b],
$(a, b) \in A \times B$, is the identity. The result follows from this.

Part (ii) A subgroup generated by a set X is also the subgroup generated by $X^{-1}$.
Apply (5.1).

Part (iii) If $(a, b, b_1) \in A \times B \times B$, then it follows from (5.3) that

$$[a, b]^{b_1} = [a, b_1]^{-1}[a, bb_1] \in [A, B].$$

Part (iv) The following statements are clearly equivalent:

1. A normalizes B
2. $[a, b] \in B$ for all $(a, b) \in A \times B$
3. $[A, B] \leq B$.

The normality assertion follows from part (iii).

Part (v) This is obvious. 

Part (vi) Without loss of generality assume that [A, B, C] and [B, C, A] are the 
identity subgroup. Then all commutators [a, b, c] and [b, c, a] are the identity, no 
matter how the triplet (a, b,c) is chosen in A x B x C. Now by Eq.(5.4), any 
commutator [c, a, b] is the identity, for any (a, b,c) € A x B x C. This proves that 
[C, A, B] has only the identity element for a generator. 


Corollary 5.2.2. Suppose again that A and B are subgroups of a group G. 


(i) If A normalizes B and centralizes [A, B], then $A/C_A(B)$ is abelian.
In particular, if $A \leq \mathrm{Aut}(B)$ and A centralizes $[A, B] := \langle b^{-1}b^a \mid (a, b) \in
A \times B \rangle$, then A is an abelian group.

(ii) If A is a group of linear transformations of the vector space V, centralizing a
subspace W as well as its factor space V/W, then A is an abelian group.¹


Proof All these results are versions of the following: By the Three Subgroups
Lemma, [A, B, A] = 1 implies [A, A, B] = 1. That is the proof.


Of course anytime someone has some machinery, some mathematician wants to 
iterate it. We have defined the commutator of two subgroups in such manner that the 
result is also a subgroup. Thus inductively we may define a multiple commutator 
of subgroups in this way: Suppose $(A_1, A_2, \ldots)$ is a (possibly infinite) sequence of
subgroups of a group G. Then for any natural number k, we define

$$[A_1, \ldots, A_{k+1}] := [[A_1, \ldots, A_k], A_{k+1}].$$

¹ This seems to be a frequently rediscovered result.


5.3 The Derived Series and Solvability


From the previous section, we see that the subgroup [G, G] is a subgroup which


(i) is mapped into itself by any endomorphism $f : G \to G$ (a consequence of
Theorem 5.2.1(v)). (This condition is called fully invariant. Since inner
automorphisms are automorphisms which are in turn endomorphisms, “fully invariant”
implies “characteristic” implies “normal”.)

(ii) yields an abelian factor, G/[G, G] (let $f : G \to G/[G, G]$ and apply parts (i)
and (iii) of Theorem 5.2.1). Moreover, if G/N is abelian, then $[G, G] \leq N$, so
[G, G] is the smallest normal subgroup of G which possesses an abelian factor
(see Chap. 3).


The group [G, G] is called the commutator subgroup or derived subgroup of G 
and is often denoted G’. 

Since we have formed the group G′ := [G, G], we may iterate this definition to
obtain the second derived group,

$$G'' := [[G, G], [G, G]].$$

Once on to a good thing, why not set $G' = G^{(1)}$ and $G'' = G^{(2)}$ and define

$$G^{(k+1)} := [G^{(k)}, G^{(k)}]$$

for all natural numbers k? The resulting sequence

$$G \geq G^{(1)} \geq G^{(2)} \geq \cdots$$

is called the derived series of G. Since each $G^{(k)}$ is characteristic in its predecessor
$G^{(k-1)}$, which in turn is characteristic in its predecessor, etc., we see the following:

Every $G^{(k)}$ is characteristic in G.

Note also that

$$G^{(k+l)} = (G^{(k)})^{(l)}. \qquad (5.6)$$

A group G is said to be solvable if and only if the derived series

$$G \geq G^{(1)} \geq G^{(2)} \geq \cdots$$

eventually becomes the identity subgroup after a finite number of steps.² The smallest
positive integer k such that $G^{(k)} = 1$ is called the derived length of G. Thus a group
is abelian if and only if it has derived length 1.
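
For a small permutation group the derived series can be computed by brute force: form all commutators, close them up under multiplication, and repeat until the series stops shrinking. The sketch below is an illustration only; the function names are hypothetical, permutations are tuples of images, and the groups Sym(4) and Sym(5) are our own choice of examples.

```python
from itertools import permutations


def mul(p, q):
    """Product pq for right actions: apply p first, then q."""
    return tuple(q[i] for i in p)


def inv(p):
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def generated_subgroup(gens, n):
    """Subgroup of Sym(n) generated by `gens`.

    In a finite group, closing {identity} under right multiplication by the
    generators already yields the generated subgroup.
    """
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems


def derived_subgroup(group, n):
    """[G, G]: the subgroup generated by all commutators x^{-1} y^{-1} x y."""
    commutators = {mul(mul(inv(x), inv(y)), mul(x, y)) for x in group for y in group}
    return generated_subgroup(commutators, n)


def derived_series_orders(group, n):
    """Orders |G|, |G^(1)|, |G^(2)|, ... until the series stops shrinking."""
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):  # the series has stabilised
            return orders
        orders.append(len(nxt))
        group = nxt


sym4_orders = derived_series_orders(set(permutations(range(4))), 4)
sym5_orders = derived_series_orders(set(permutations(range(5))), 5)
print(sym4_orders, sym5_orders)
```

For Sym(4) the orders are 24, 12, 4, 1, so Sym(4) is solvable of derived length 3. For Sym(5) the series stabilises at a subgroup of order 60 (the alternating group), so it never reaches the identity and Sym(5) is not solvable.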


² In this case, for once, the name “solvable” is not accidental. The reason will be explained in
Chap. 11 on Galois Theory.

Theorem 5.3.1 (i) If $H \leq G$ and $A \leq G$, then $[A, H] \leq [A, G]$. In particular
$[H, H] \leq [G, H] \leq [G, G]$.

(ii) If $H \leq G$, then $H^{(k)} \leq G^{(k)}$. Thus, if G is solvable, so is H.
(iii) If $f : G \to H$ is a group homomorphism, then

$$f(G^{(k)}) = (f(G))^{(k)} \leq H^{(k)}.$$

(iv) If $N \trianglelefteq G$, then G is solvable if and only if both N and G/N are solvable.
(v) For a direct product,

$$(A \times B)^{(k)} = A^{(k)} \times B^{(k)}.$$

(vi) The class of all solvable groups is closed under taking finite direct products and
taking subgroups and hence is closed under taking finite subdirect products. In
particular, if G is a finite group, there is a normal subgroup N of G minimal
among normal subgroups with respect to the property that G/N is solvable.
(Clearly this normal subgroup is characteristic; it is called the solvable
residual.)


Proof Part (i) follows from the containment

$$\{[a, h] \mid (a, h) \in A \times H\} \subseteq \{[a, g] \mid (a, g) \in A \times G\}.$$

Part (ii) By Part (i), $H' \leq G'$. If we have $H^{(k-1)} \leq G^{(k-1)}$, then

$$H^{(k)} = [H^{(k-1)}, H^{(k-1)}] \leq [G^{(k-1)}, G^{(k-1)}] = G^{(k)},$$

utilizing $H^{(k-1)}$ and $G^{(k-1)}$ in the roles of H and G in the last sentence of Part (i).
The result now follows by induction.

Part (iii) The equal sign is from Theorem 5.2.1, Part (v); the subgroup relation is
from Part (i) above of this Theorem.

Part (iv) Suppose G is solvable. Then so are N and G/N by Parts (ii) and (iii).
Now assume G/N and N are solvable. Then there exist natural numbers k and l
such that $(G/N)^{(k)} = 1$ and $N^{(l)} = 1$. Then by Part (iii), the former equation yields

$$G^{(k)}N/N = 1 \quad\text{or}\quad G^{(k)} \leq N.$$

Then

$$G^{(k+l)} = (G^{(k)})^{(l)} \leq N^{(l)} = 1,$$

by Part (ii), so G is solvable.

Part (v) If the subgroups A and B of G centralize each other, it follows from
Eqs. (5.2) or (5.3) that $[ab, a'b'] = [a, a'][b, b']$ for all $(a, b), (a', b') \in A \times B$. This
shows $(A \times B)' \leq A' \times B'$. The reverse containment follows from Part (i). Thus the
conclusion holds for k = 1. One need only iterate this result sufficiently many times
to get the stated result for all k.

Part (vi) This is a simple application of the other parts. A subgroup of a solvable 
group is solvable because of the inequality of Part (ii). A finite direct product of 
solvable groups is solvable by Part (v). Thus, from the definition of subdirect product, 
a finite subdirect product of solvable groups is solvable. The last part follows from 
the remarks in Sect. 3.6.2. 


Remark Suppose $G_1, G_2, \ldots$ is a countably infinite sequence of finite groups for
which $G_k$ has derived length k. (It is a fact that a group of derived length k exists
for every positive integer k: perhaps one of those very rare applications of wreathed
products.) Then the direct sum $\bigoplus_{k \in \mathbb{N}} G_k = G_1 \oplus G_2 \oplus \cdots$ does not have a finite derived
length and so is not solvable. Thus solvable groups are not closed under infinite direct
products.


Recall that a finite subnormal series for G is a chain

$$G = N_0 \trianglerighteq N_1 \trianglerighteq \cdots \trianglerighteq N_k = 1,$$

where each $N_{j+1}$ is a normal subgroup of its predecessor $N_j$, $j = 0, \ldots, k - 1$, and
k is a natural number. (Of course what we have written is actually a descending
subnormal series. We could just as well have written it backwards as an ascending
series.) Recall also that an unrefinable subnormal series of finite length is called a
composition series, and when such a thing exists, the factors $N_j/N_{j+1}$ are simple
groups called the composition factors of G. That the list of composition factors that
appears is the same for all composition series was the substance of the Jordan-Hölder
Theorem for groups.
These notions make their reappearance in the following:


Corollary 5.3.2 (i) A group G is solvable of derived length at most k if and only
if it possesses a subnormal series

$$G = N_0 \trianglerighteq N_1 \trianglerighteq \cdots \trianglerighteq N_k = 1$$

such that each factor $N_j/N_{j+1}$, $j = 0, \ldots, k - 1$, is abelian.

(ii) A finite group is solvable if and only if it possesses a composition series with all 
composition factors cyclic of prime order (of course, the prime may depend on 
the particular factor taken). 


Proof Part (i) If G is solvable, the derived series is a subnormal series for which all
factor groups formed by successive members of the series are abelian. Conversely,
if the subnormal series is as given, then $N_0/N_1$ abelian implies $[G, G] = G^{(1)} \leq N_1$.
Inductively, if $G^{(j)} \leq N_j$, then $N_j/N_{j+1}$ abelian implies $G^{(j+1)} \leq [N_j, N_j] \leq
N_{j+1}$. So by induction, $G^{(j+1)} \leq N_{j+1}$ for all $j = 0, 1, \ldots, k - 1$. Thus $G^{(k)} \leq
N_k = 1$, and G is solvable.
Part (ii) If G possesses a composition series with all composition factors cyclic of
prime order, G is solvable by Part (i). On the other hand, if G is finite and solvable
it possesses a subnormal series

$$G = N_0 \trianglerighteq N_1 \trianglerighteq \cdots \trianglerighteq N_k = 1,$$

with each factor $N_j/N_{j+1}$ a finite abelian group. But in any finite abelian group A
(where all subgroups are normal) we can form a composition series by choosing
a maximal subgroup $M_1$ of A, a maximal subgroup $M_2$ of $M_1$, and so forth. Each
factor will then be cyclic of prime order. Thus each interval $N_j \geq N_{j+1}$ can be refined
to a chain with successive factors of prime order. The result is a refinement of the
original subnormal series to a composition series with all composition factors cyclic
of prime order.

There are many very interesting facts about solvable groups, especially finite ones, 
which unfortunately we do not have space to reveal. Two very interesting topics are 
these: 


• Philip Hall's theory of Sylow systems. In a finite solvable group $G$ there is a
representative $S_i$ of $\mathrm{Syl}_{p_i}(G)$, the $p_i$-Sylow subgroups of $G$, such that $G = S_1 S_2 \cdots S_m$,
and the $S_i$ are permutable in pairs, that is, $S_i S_j = S_j S_i$, where $p_1, \dots, p_m$ is the list
of all primes dividing the order of $G$. Thus in any solvable group of order 120, for
example, there must exist a subgroup of order 15. (This is not true, for example, of
the non-solvable group Sym(5) of that order.) A full account of Hall systems
can be found in [22].

• The theory of formations. Basically, a formation is an isomorphism-closed class $\mathcal{F}$
of finite solvable groups closed under finite subdirect products. Then each finite
group $G$ contains a characteristic subgroup $G_{\mathcal{F}}$ which is unique in being minimal
among all normal subgroups $N$ of $G$ which yield a factor $G/N$ belonging to $\mathcal{F}$.
($G_{\mathcal{F}}$ is called the $\mathcal{F}$-residual of $G$.) Whenever $G$ is a finite group, let $D(G)$ be the
intersection of all the maximal subgroups of $G$. This is a characteristic subgroup
which we shall meet shortly, called the Frattini subgroup of $G$. A formation is said
to be saturated if, for every solvable finite group $G$, $G/D(G) \in \mathcal{F}$ implies $G \in \mathcal{F}$.
If $\mathcal{F}$ is saturated, then there exists a special class of subgroups, the $\mathcal{F}$-subgroups,
subgroups in $\mathcal{F}$ described only by their embedding in $G$, which form a conjugacy
class. Finite abelian groups comprise a formation but are not a saturated formation.
A class of groups considered in the next section are the nilpotent groups and they
do form a saturated formation. As a result one gets a theorem like this: "Suppose
$X$ and $Y$ are two nilpotent subgroups of a finite solvable group $G$, each of which
is its own normalizer in $G$. Then $X$ and $Y$ are conjugate subgroups. Moreover,
$G$ must contain such groups."³ A bit surprising. (The term 'nilpotent group' is
defined in the following section.)


³ This conjugacy class of subgroups was first announced in a paper by Roger Carter; they are called
the class of "Carter subgroups" of $G$.
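
The remark about Sym(5) in the first item above can be checked by brute force. The sketch below relies on a standard fact that is not proved in the text, namely that every group of order 15 is cyclic, so a subgroup of order 15 exists exactly when some element has order 15. The helper name `perm_order` is ours and purely illustrative.

```python
from itertools import permutations

def perm_order(p):
    """Order of a permutation given as a tuple mapping i -> p[i]."""
    identity = tuple(range(len(p)))
    power, k = p, 1
    while power != identity:
        power = tuple(p[i] for i in power)  # multiply by p once more
        k += 1
    return k

# All element orders that occur in Sym(5), the symmetric group on 5 letters.
orders_in_sym5 = sorted({perm_order(p) for p in permutations(range(5))})
print(orders_in_sym5)  # [1, 2, 3, 4, 5, 6]

# Assumption (standard, not from the text): a group of order 15 is cyclic.
has_order_15_subgroup = 15 in orders_in_sym5
print(has_order_15_subgroup)  # False
```

Since no element of order 15 occurs, Sym(5) has no subgroup of order 15, in agreement with the text.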
5.4 Central Series and Nilpotent Groups


5.4.1 The Upper and Lower Central Series 


Let $G$ be a fixed group. Set $\gamma_0(G) = G$, $\gamma_1(G) = [G, G]$, and inductively define

$$\gamma_{k+1}(G) := [\gamma_k(G), G],$$

for all positive integers $k$. Then the descending sequence of subgroups

$$G = \gamma_0(G) \ge \gamma_1(G) \ge \gamma_2(G) \ge \cdots$$

is called the lower central series of the group $G$. Note that

$$\gamma_k(G) = [G, G, \dots, G] \quad \text{(with $k+1$ arguments).}$$

Since any endomorphism $f : G \to G$ maps the arguments of the above multiple
commutator into themselves, the endomorphism $f$ maps each member of the lower
central series into itself. A similar argument for homomorphisms allows us to state
the following elementary result:


Lemma 5.4.1 (i) Each member $\gamma_k(G)$ of the lower central series is a fully invariant
subgroup of $G$.

(ii) If $f : G \to H$ is a homomorphism of groups then

$$f(\gamma_k(G)) = \gamma_k(f(G)).$$

(iii) If $H$ is a subgroup of $G$, then $\gamma_k(H) \le \gamma_k(G)$.


Proof The proof is an easy exercise. 


A group $G$ is said to be nilpotent if and only if its lower central series terminates
at the identity subgroup in a finite number of steps. In this case $G$ is said to belong to
nilpotence class $k$ if and only if $k$ is the smallest positive integer such that $\gamma_k(G) = 1$.
Thus abelian groups are the groups of nilpotence class 1.

The following is immediate from Lemma 5.4.1, and the definition of nilpotence. 


Corollary 5.4.2. (i) Every homomorphic image of a group of nilpotence class at 
most k is a group of nilpotence class at most k. 


(ii) Every subgroup of a group of nilpotence class at most k is also nilpotent of 
nilpotence class at most k. 

(iii) Every finite direct product of nilpotent groups of nilpotence class at most k is 
itself nilpotent of nilpotence class at most k. 
The beginning student should be able to supply immaculate proofs of these
assertions by this time. [Follow the lead of the similar result for solvable groups in the
previous section; do not hesitate to utilize the commutator identities in proving part
(iii) of this corollary.]

Clearly, the definitions show that

$$[\gamma_k(G), \gamma_k(G)] \le [\gamma_k(G), G] = \gamma_{k+1}(G),$$

so the factor groups $\gamma_k(G)/\gamma_{k+1}(G)$ are abelian. As a consequence,

Corollary 5.4.3. Any nilpotent group is solvable.


The converse fails drastically: Sym(3) is a solvable group of derived length 2,
but it is not nilpotent, since its lower central series drops from Sym(3) to its subgroup
$N \cong \mathbb{Z}_3$, and stabilizes at $N$ for the rest of the series.
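
The behaviour of Sym(3) described above can be confirmed mechanically. The following is a minimal sketch, assuming permutations are stored as tuples `p` with `p[i]` the image of `i`; all function names (`compose`, `generate`, `series`, and so on) are our own illustrative choices, not notation from the text.

```python
def compose(p, q):
    """Permutation 'first p, then q' (tuples mapping i -> p[i])."""
    return tuple(q[i] for i in p)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generate(gens, n):
    """Subgroup of Sym(n) generated by gens (closure under multiplication)."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def commutator_subgroup(A, B, n):
    """[A, B]: the subgroup generated by all commutators a^-1 b^-1 a b."""
    comms = {compose(compose(inverse(a), inverse(b)), compose(a, b))
             for a in A for b in B}
    return generate(comms, n)

def series(G, n, central):
    """Lower central series if central is True, derived series otherwise.
    Stops as soon as the next term would repeat the current one."""
    terms = [G]
    while True:
        nxt = commutator_subgroup(terms[-1], G if central else terms[-1], n)
        if nxt == terms[-1]:
            return terms
        terms.append(nxt)

sym3 = generate([(1, 0, 2), (1, 2, 0)], 3)  # a transposition and a 3-cycle
derived_orders = [len(H) for H in series(sym3, 3, central=False)]
lower_central_orders = [len(H) for H in series(sym3, 3, central=True)]
print(derived_orders)        # [6, 3, 1]
print(lower_central_orders)  # [6, 3]
```

The derived series reaches the identity after two steps, while the lower central series stalls at a subgroup of order 3, which is exactly the failure of nilpotence noted in the text.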

Now again fix $G$. Set $Z_0(G) := 1$, the identity subgroup; set $Z_1(G) = Z(G)$, the
center of $G$ (the characteristic subgroup of those elements of $G$ which commute with
every element of $G$); and inductively define $Z_k(G)$ to be that subgroup satisfying

$$Z_k(G)/Z_{k-1}(G) = Z(G/Z_{k-1}(G)).$$

That is, $Z_k(G)$ is the inverse image of the center of $G/Z_{k-1}(G)$ under the natural
homomorphism $G \to G/Z_{k-1}(G)$. For example:

$$Z_2(G) = \{z \in G \mid [z, g] \in Z(G) \text{ for all } g \in G\}.$$

By definition $Z_k(G) \le Z_{k+1}(G)$, so we obtain an ascending series of
characteristic subgroups

$$1 = Z_0(G) \le Z_1(G) \le Z_2(G) \le \cdots,$$

called the upper central series of $G$. Of course if $G$ has a trivial center, this series
does not even get off the ground. But, as we shall see below, it proceeds all the way
to the top precisely when $G$ is nilpotent.

Suppose now that $G$ has nilpotence class $k$. This means $\gamma_k(G) = 1$ while $\gamma_{k-1}(G)$
is non-trivial. Then as

$$[\gamma_{k-1}(G), G] = \gamma_k(G) = 1,$$

$G$ centralizes $\gamma_{k-1}(G)$, that is

$$\gamma_{k-1}(G) \le Z_1(G).$$

Now assume, for the purposes of induction, that

$$\gamma_{k-j}(G) \le Z_j(G).$$

Then

$$[\gamma_{k-j-1}(G), G] = \gamma_{k-j}(G) \le Z_j(G),$$

so

$$\gamma_{k-j-1}(G) \le \{z \in G \mid [z, G] \le Z_j(G)\} = Z_{j+1}(G).$$

Thus, by the induction principle,

$$\gamma_{k-j}(G) \le Z_j(G), \text{ for all } 0 \le j \le k. \tag{5.7}$$

Putting $j = k$ in (5.7), we see that $Z_k(G) \ge \gamma_0(G) = G$. That is, if the lower
central series has length $k$, then the upper central series terminates at $G$ in at most
$k$ steps.

Now assume $G \ne 1$ has an upper central series terminating at $G$ in exactly $k$
steps, that is, $Z_k(G) = G$ while $Z_{k-1}(G)$ is a proper subgroup of $G$ (as $G \ne 1$).
Then, of course, $G/Z_{k-1}(G)$ is abelian, and so $\gamma_1(G) = [G, G] \le Z_{k-1}(G)$. Now for
the purpose of an induction argument, assume that

$$\gamma_j(G) \le Z_{k-j}(G).$$

Then

$$Z_{k-j}(G)/Z_{k-j-1}(G) = Z(G/Z_{k-j-1}(G))$$

implies

$$[Z_{k-j}(G), G] \le Z_{k-j-1}(G),$$

so

$$\gamma_{j+1}(G) := [\gamma_j(G), G] \le [Z_{k-j}(G), G] \le Z_{k-(j+1)}(G).$$

Thus

$$\gamma_j(G) \le Z_{k-j}(G), \text{ for all } 0 \le j \le k. \tag{5.8}$$

Thus if $j = k$ in (5.8), $\gamma_k(G) \le Z_0(G) := 1$. Thus if the upper central series
terminates at $G$ in exactly $k$ steps, then the lower central series terminates at the
identity subgroup in at most $k$ steps.
We can assemble these observations as follows:


Theorem 5.4.4 The following assertions are equivalent:


1. $G$ belongs to nilpotence class $k$. (By definition, its lower central series terminates
at the identity in exactly $k$ steps.)
2. The upper central series of $G$ terminates at $G$ in exactly $k$ steps.
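
As an illustration of Theorem 5.4.4, both series can be computed for the dihedral group of order 8. This is a sketch under our own conventions: permutations are tuples, the upper central series is built from the description $Z_{k+1}(G) = \{z \mid [z, g] \in Z_k(G) \text{ for all } g\}$ (the form given above for $Z_2(G)$), and all function names are hypothetical.

```python
def compose(p, q):
    """Permutation 'first p, then q' (tuples mapping i -> p[i])."""
    return tuple(q[i] for i in p)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def commutator(a, b):
    """[a, b] = a^-1 b^-1 a b."""
    return compose(compose(inverse(a), inverse(b)), compose(a, b))

def generate(gens, n):
    """Subgroup of Sym(n) generated by gens."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def lower_central_series(G, n):
    """gamma_0 = G, gamma_{k+1} = [gamma_k, G], until the series stabilizes."""
    terms = [G]
    while True:
        nxt = generate({commutator(a, g) for a in terms[-1] for g in G}, n)
        if nxt == terms[-1]:
            return terms
        terms.append(nxt)

def upper_central_series(G, n):
    """Z_0 = 1, Z_{k+1} = {z : [z, g] in Z_k for all g}, until it stabilizes."""
    terms = [frozenset({tuple(range(n))})]
    while True:
        Zk = terms[-1]
        nxt = frozenset(z for z in G if all(commutator(z, g) in Zk for g in G))
        if nxt == Zk:
            return terms
        terms.append(nxt)

# Dihedral group of order 8: a rotation and a reflection of the square 0,1,2,3.
d4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)], 4)
lower = [len(H) for H in lower_central_series(d4, 4)]
upper = [len(H) for H in upper_central_series(d4, 4)]
print(lower)  # [8, 2, 1]
print(upper)  # [1, 2, 8]
```

Both series take exactly two steps, so this group has nilpotence class 2 by either count.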


There is an interesting local property of nilpotent groups: 


Theorem 5.4.5 If $G$ is a nilpotent group, then any proper subgroup $H$ of $G$ is
properly contained in its normalizer $N_G(H)$.


Proof If $G = 1$, there is nothing to prove. Suppose $H$ is a proper subgroup of $G$.
Then there is a minimal positive integer $j$ such that $Z_j(G)$ is not contained in $H$.
Then there is an element $z$ in $Z_j(G) - H$. Then $[z, H] \le [z, G] \le Z_{j-1}(G) \le H$.
Thus $z$ normalizes $H$, but does not lie in it.


Corollary 5.4.6 Every maximal subgroup of a nilpotent group is normal. 


Remark There are many groups with no maximal subgroups at all. Some of these are 
not even nilpotent. Thus one should not expect a converse to the above Corollary. This 
raises the question, though, whether the property of the conclusion of Theorem 5.4.5 
characterizes nilpotent groups. Again one might expect that this is false for infinite 
groups. However, in the next subsection we shall learn that among finite groups it is 
indeed a characterizing property. 


Here is a classical example of a nilpotent group: Suppose $V$ is an $n$-dimensional
vector space over a field $F$. A chamber is an ascending sequence of subspaces
$S = (V_1, V_2, \dots, V_{n-1})$ where $V_j$ has vector space dimension $j$. (This chain is unrefinable.)
Now consider the group $B(S)$ of all linear transformations $T \in GL(V)$ such that $T$
fixes each $V_j$ and induces the identity transformation on each 1-dimensional factor
space $V_j/V_{j-1}$. Rendered as a group of matrices these are upper triangular matrices
with "1's" on the diagonal. The reader may verify that the $k$-th member of the lower
central series of this group consists of upper triangular matrices still having 1's on
the diagonal, but having more and more upper subdiagonals all zero as $k$ increases.
Eventually one sees that $\gamma_{n-1}(B(S))$ contains only the identity matrix.
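
The verification left to the reader can be previewed on a small case. The sketch below takes $n = 4$ and the field with two elements (both are our choices, made only to keep the group small, with 64 elements) and records, for each member of the lower central series, its order and the number of superdiagonals that vanish on it. All names are illustrative.

```python
from itertools import product

P, N = 2, 4  # illustrative choices: the field F_2 and 4 x 4 matrices

IDENTITY = tuple(tuple(int(i == j) for j in range(N)) for i in range(N))

def mul(a, b):
    """Matrix product modulo P."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(N)) % P
                       for j in range(N)) for i in range(N))

def inv(a):
    """Inverse in a finite group: the power of a just before the identity."""
    prev, cur = IDENTITY, a
    while cur != IDENTITY:
        prev, cur = cur, mul(cur, a)
    return prev

def generate(gens):
    """Subgroup generated by gens (closure under multiplication)."""
    elems, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def unitriangular_group():
    """All upper triangular N x N matrices over F_P with 1's on the diagonal."""
    above = [(i, j) for i in range(N) for j in range(i + 1, N)]
    group = set()
    for values in product(range(P), repeat=len(above)):
        m = [list(row) for row in IDENTITY]
        for (i, j), v in zip(above, values):
            m[i][j] = v
        group.add(tuple(map(tuple, m)))
    return frozenset(group)

def lower_central_series(G):
    inverse_of = {g: inv(g) for g in G}
    terms = [G]
    while True:
        comms = {mul(mul(inverse_of[a], inverse_of[g]), mul(a, g))
                 for a in terms[-1] for g in G}
        nxt = generate(comms)
        if nxt == terms[-1]:
            return terms
        terms.append(nxt)

def zero_superdiagonals(H):
    """How many superdiagonals, counted outward from the main one, vanish on H."""
    d = 0
    while d + 1 < N and all(m[i][i + d + 1] == 0
                            for m in H for i in range(N - d - 1)):
        d += 1
    return d

terms = lower_central_series(unitriangular_group())
orders = [len(H) for H in terms]
zeros = [zero_superdiagonals(H) for H in terms]
print(orders)  # [64, 8, 2, 1]
print(zeros)   # [0, 1, 2, 3]
```

Each step of the series kills one more superdiagonal, and the series reaches the identity at $\gamma_3$, that is, at $\gamma_{n-1}$ for $n = 4$.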


5.4.2 Finite Nilpotent Groups 


Let $p$ be a prime number. As originally defined, a $p$-group is a group in which
each element has $p$-power order. Suppose $P$ is a finite $p$-group. Then by the Sylow
Theorem, the order of $P$ cannot be divisible by a prime distinct from $p$. Thus we
observe


A finite group is a p-group if and only if the group has p-power order. 
Now we have seen before (Lemma 4.3.2, part (ii)), that a non-trivial finite
$p$-group must possess a non-trivial center. Thus if $P \ne 1$ is a finite $p$-group, we
see that $Z(P) \ne 1$. Now if $Z(P) = P$, then $P$ is abelian. Otherwise, $P/Z(P)$ has a
nontrivial center. Similarly, in general, if $P$ is a finite $p$-group:


Either $Z_k(P) = P$ or $Z_k(P)$ properly lies in $Z_{k+1}(P)$.


As a result, a finite p-group possesses an upper central series of finite length, 
and hence is a nilpotent group. Obviously, a finite direct product of finite p-groups 
(where the prime p is allowed to depend upon the direct factor) is a finite nilpotent 
group. 

Now consider this 


Lemma 5.4.7 If $G$ is a finite group and $S$ is a $p$-Sylow subgroup of $G$, then
any subgroup $H$ which contains the normalizer of $S$ is itself self-normalizing, i.e.
$N_G(H) = H$. In particular $N_G(S)$ is self-normalizing.


Proof Suppose $x$ is an element of $N_G(H)$. Since $S$ is a $p$-Sylow subgroup of $G$, it
is also a $p$-Sylow subgroup of $N_G(H)$ and $H$. Since $x$ normalizes $H$, we have
$S^x \in \mathrm{Syl}_p(H)$, so by Sylow's theorem there exists an element $h \in H$ such that
$S^x = S^h$. Then $xh^{-1} \in N_G(S) \le H$, whence $x \in H$. (Actually this could be done
in one line, by exploiting the "Frattini Argument" (Chap. 4; Corollary 4.3.4).)


Now we have 


Theorem 5.4.8 Let G be a finite group. Then the following statements are equiva- 
lent. 


(i) G is nilpotent. 
(ii) G is isomorphic to the direct product of its various Sylow subgroups. 
(iii) Every Sylow subgroup of G is a normal subgroup. 


Proof ((ii) implies (i)) Finite $p$-groups are nilpotent, as we have seen, and so too are
their finite direct products. Thus any finite group which is the direct product of its
Sylow subgroups is certainly nilpotent.

((ii) if and only if (iii)) Now it is easy to see that a group is isomorphic to a direct
product of its various $p$-Sylow subgroups ($p$ ranging over the prime divisors of the order
of $G$) if and only if each Sylow subgroup of $G$ is normal. Being direct factors, Sylow
subgroups of such a direct product are certainly normal. Conversely, if all $p$-Sylow
subgroups are normal, the direct product criterion of Chap. 3 (Lemma 3.5.1, Part (iii),
p. 96) is satisfied, upon considering the orders of the intersections $S_1 S_2 \cdots S_r \cap S_{r+1}$,
where $\{S_i\}$ is the full collection of Sylow subgroups of $G$ (one for each
prime, since each $\mathrm{Syl}_p(G)$ has a single element).

((i) implies (iii)) So it suffices to show that in a finite nilpotent group, each $p$-Sylow
subgroup $P$ is a normal subgroup of $G$. If $N_G(P) = G$, there is nothing to prove. But
if $N_G(P)$ is a proper subgroup of $G$, then it is a proper self-normalizing subgroup by
Lemma 5.4.7. This completely contradicts Theorem 5.4.5 which asserts that every
proper subgroup of a nilpotent group is properly contained in its normalizer.
Fix a group $G$. A non-generator of $G$ is an element $x$ of $G$ such that whenever $x$ is
a member of a generating set $X$ (that is, when $x \in X$ and $\langle X \rangle = G$) then $X - \{x\}$ is
also a generating set: $\langle X - \{x\} \rangle = G$. Thus $x$ is just one of those superfluous elements
that you never need. Now it is not easy to see from "first principles" (that is, from
the defining property alone) that in a finite group, the product of a non-generator
and a non-generator is again a non-generator. But we have this characterization of a
non-generator:


Lemma 5.4.9 Suppose G is a finite group. Let D(G) be the intersection of all 
maximal subgroups of G. Then D(G) is the set of all nongenerators of G. 


Proof Two statements are to be proven: (i) every non-generator lies in $D(G)$, and
(ii) every element of $D(G)$ is a non-generator.

(i) Suppose $x$ is a non-generator, and let $M$ be any maximal subgroup of $G$. Then
if $x$ is not in $M$ we must have $\langle \{x\} \cup M \rangle = G$, while $\langle M \rangle = M$, against the definition
of "non-generator". Thus $x$ lies in every maximal subgroup $M$ and so $x \in D(G)$.

(ii) On the other hand assume that $x$ is an arbitrary element of $D(G)$, and that $X$
is a set of generators of $G$ which counts $x$ among its active participants. If $X - \{x\}$
does not generate $G$, then, as $G$ is finite, there is a maximal subgroup $M$ containing
$H := \langle X - \{x\} \rangle$. But by assumption $x \in D(G) \le M$, whence $\langle X \rangle \le M$, a
contradiction. Thus it is always the case that if $x \in X$, where $X$ generates $G$, then
$X - \{x\}$ also generates $G$. Thus $x$ is a non-generator.
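
Lemma 5.4.9 can be observed directly on a small group by computing both sides independently. The sketch below uses the dihedral group of order 8 (our choice of example) and simply enumerates all 256 subsets of the group; the names `span`, `maximal`, `frattini` and `is_nongenerator` are hypothetical helpers, not notation from the text.

```python
from itertools import combinations

def compose(p, q):
    """Permutation 'first p, then q' (tuples mapping i -> p[i])."""
    return tuple(q[i] for i in p)

def generate(gens, n):
    """Subgroup of Sym(n) generated by gens; the empty set gives the trivial group."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

d4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)], 4)  # dihedral group of order 8
elements = sorted(d4)

# The subgroup generated by each subset of G (2^8 = 256 subsets).
span = {}
for r in range(len(elements) + 1):
    for subset in combinations(elements, r):
        span[frozenset(subset)] = generate(subset, 4)

# Maximal subgroups and their intersection D(G).
proper = [H for H in set(span.values()) if H != d4]
maximal = [M for M in proper if not any(M < K for K in proper)]
frattini = frozenset.intersection(*maximal)

def is_nongenerator(x):
    """Defining property: removing x from any generating set X still generates G."""
    return all(span[X - {x}] == d4
               for X in span if x in X and span[X] == d4)

nongenerators = frozenset(x for x in elements if is_nongenerator(x))
print(len(maximal), sorted(frattini))
print(nongenerators == frattini)  # True
```

Here the three maximal subgroups (each of order 4) meet in the subgroup of order 2 generated by the half-turn, and the same two elements are exactly the non-generators found from the definition.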


Remark This theorem really belongs to posets. Clearly finiteness can be replaced by 
the only distinctive property used in the proof: that for every subset X of this poset, 
there is an upper bound—that is, an element of the filter P*, which is either the 
1-element of the poset or lies in a maximal member of the poset with the 1-element 
removed. Its natural habitat is therefore lattices with the ascending chain condition. 
(We need upper semi-lattices so that we can take suprema; we need something like 
a lower-semilattice in order to define the Frattini element of the poset as the meet 
of all maximal elements. Finally the ascending chain condition makes sure that the 
maximal elements fully cover all subsets whose supremum is not “1”. So at least 
lattices with the ascending chain condition (ACC) are more than sufficient for this 
result.) 


Theorem 5.4.10 (Global non-generators) If X U D(G) generates a finite group G, 
then X generates G. 


Proof The argument is the same and has a similar poset-generalization. Suppose $\langle X \rangle$
is not $G$. Then it lies below a maximal subgroup $M$. By definition $D(G)$ must lie
in $M$, and so $X \cup D(G)$ lies in $M$, which contradicts the hypothesis that $X \cup D(G)$
generates $G$. Thus we must abandon the foregoing supposition and conclude that
$\langle X \rangle = G$.
The subgroup $D(G)$ is called the Frattini subgroup of $G$.* It is clearly a
characteristic subgroup of $G$. In the case of finite groups, something special occurs which
could be imitated only artificially in the context of posets:


Theorem 5.4.11 In a finite group, the Frattini subgroup $D(G)$ is a characteristic
nilpotent subgroup of $G$.


Proof The "characteristic" part is clear, and has been remarked on earlier. Suppose
$P \in \mathrm{Syl}_p(D(G))$. By the Frattini Argument, $G = N_G(P)D(G)$. It now follows
from Theorem 5.4.10 that $N_G(P) = G$ and so in particular $P$ is normal in $D(G)$.
Thus every $p$-Sylow subgroup of $D(G)$ is normal in it. Since it is finite, it is a direct
product of its Sylow subgroups and so is nilpotent.


Corollary 5.4.12 (Wielandt) For a finite group G the following are equivalent: 


(i) G is nilpotent. 
(ii) Every maximal subgroup of G is normal. 


(iii) $G' \le D(G)$.


Proof If $M$ is a maximal subgroup of the finite group $G$ which also happens to be
normal, then $G/M$ is a simple group with no proper non-trivial subgroups, and so must be a
cyclic group of order $p$, for some prime $p$. Thus if all maximal subgroups are normal,
the commutator subgroup lies in each maximal subgroup and so lies in the Frattini
subgroup $D(G)$. On the other hand, if $G/D(G)$ is abelian, every subgroup above
$D(G)$ is normal; so in particular, all maximal subgroups are normal. Thus Parts (ii)
and (iii) are equivalent.

But if $M$ is a maximal subgroup of a nilpotent group, $M$ properly lies in its
normalizer in $G$, and so $N_G(M) = G$. Thus the assertion of Part (i) implies Parts (ii)
and (iii).

So it remains only to show that if every maximal subgroup is normal, then $G$ is
nilpotent. Since $G$ is finite, we need only show that every Sylow subgroup is normal,
under these hypotheses. Suppose $P$ is a $p$-Sylow subgroup of $G$. Suppose $N_G(P) \ne G$.
Then $N_G(P)$ lies in some maximal subgroup $M$ of $G$. By Lemma 5.4.7, $M$ is
self-normalizing. But that contradicts our assumption that every maximal subgroup
is normal. Thus $N_G(P) = G$ for every Sylow subgroup $P$. Now $G$ is nilpotent by
Theorem 5.4.8.
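
The three conditions of Corollary 5.4.12 can be compared on two small groups, one nilpotent and one not. This is a brute-force sketch for groups of order at most 8 only (it enumerates every subset of the group); the example groups and all function names are our own illustrative choices.

```python
from itertools import combinations

def compose(p, q):
    """Permutation 'first p, then q' (tuples mapping i -> p[i])."""
    return tuple(q[i] for i in p)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def commutator(a, b):
    return compose(compose(inverse(a), inverse(b)), compose(a, b))

def generate(gens, n):
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def is_normal(H, G):
    return all(frozenset(compose(compose(inverse(g), h), g) for h in H) == H
               for g in G)

def wielandt_conditions(G, n):
    """Return (G nilpotent, every maximal subgroup normal, G' <= D(G))."""
    elems = sorted(G)
    subgroups = {generate(s, n)
                 for r in range(len(elems) + 1) for s in combinations(elems, r)}
    proper = [H for H in subgroups if H != G]
    maximal = [M for M in proper if not any(M < K for K in proper)]
    frattini = frozenset.intersection(*maximal)
    derived = generate({commutator(a, b) for a in G for b in G}, n)
    gamma = G  # run the lower central series until it stabilizes
    while True:
        nxt = generate({commutator(a, g) for a in gamma for g in G}, n)
        if nxt == gamma:
            break
        gamma = nxt
    return (len(gamma) == 1,
            all(is_normal(M, G) for M in maximal),
            derived <= frattini)

sym3 = generate([(1, 0, 2), (1, 2, 0)], 3)
d4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)], 4)
sym3_result = wielandt_conditions(sym3, 3)
d4_result = wielandt_conditions(d4, 4)
print(sym3_result)  # (False, False, False)
print(d4_result)    # (True, True, True)
```

In each group the three conditions stand or fall together, as the corollary predicts.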


5.5 Coprime Action 


Suppose $G$ is a finite group possessing a normal abelian subgroup $A$, and let
$F = G/A$ be the associated factor group. Any transversal of $A$ (that is, a system of coset
representatives of $A$ in $G$) can be indexed by the elements of the factor group $F$. Any
transversal $T := \{t_f \mid f \in F\}$ determines a function

$$\alpha_T : F \times F \to A$$

(called a factor system) by the rule that for all $(f, g) \in F \times F$,

$$t_f \cdot t_g = t_{fg}\,\alpha_T(f, g).$$


* In some of the earlier literature one finds $D(G)$ written as "$\Phi(G)$". We do not understand fully
the reason for the change.


In some sense, the factor system measures how far off the transversal is from being
a group. Because of the associative law, one has

$$\alpha_T(fg, h)\,\alpha_T(f, g)^h = \alpha_T(f, gh)\,\alpha_T(g, h), \tag{5.9}$$

for all $f, g, h \in F$. (Notice that since $A$ is abelian, conjugation by any element $y$ of
the coset $h = Ax$ gives the same result $a^y = y^{-1}ay = x^{-1}ax$, which we have denoted
by $a^h$.)

Now, given a transversal $T = \{t_f \mid f \in F\}$, any other transversal $S = \{s_f \mid f \in F\}$
is uniquely determined by a function $\beta : F \to A$, by the rule that

$$s_f = t_f\,\beta(f), \quad f \in F.$$

Then the new factor system $\alpha_S$ for transversal $S$ can be determined from the old one
$\alpha_T$ by the equations

$$\alpha_S(f, g) = \alpha_T(f, g)\,[\beta(fg)]^{-1}\,[\beta(f)^g \beta(g)] \text{ for all } f, g \in F. \tag{5.10}$$

The group $G$ is said to split over $A$ if and only if there is a subgroup $H$ of $G$
such that $G = HA$, $H \cap A = 1$; that is, if and only if some transversal $H$ forms
a subgroup. Such a subgroup $H$ is called a complement of $A$ in $G$. In that case, the
factor system $\alpha_H$ for $H$ is just the constant function at the identity element of $A$.
Then, given $G$, $A$ and transversal $T = \{t_f \mid f \in F\}$, we can conclude from Eq. (5.10)
that $G$ splits over $A$ if and only if there exists a function $\beta : F \to A$ such that

$$\alpha_T(f, g) = \beta(fg)\,(\beta(f)^g)^{-1}\,\beta(g)^{-1}, \text{ for all } f, g \in F. \tag{5.11}$$


Now suppose $A$ is a normal abelian $p$-subgroup of the finite group $G$ and suppose
$P$ is a Sylow $p$-subgroup of $G$ with the property that "$P$ splits over $A$". That means
that $P$ (which necessarily contains $A$) contains a subgroup $B$ such that $P = AB$ and
$B \cap A = 1$. Let $X$ be a transversal of $P$ in $G$. Then $T := BX := \{bx \mid b \in B, x \in X\}$
is a transversal of $A$ in $G$ whose factor system $\alpha_T$ satisfies

$$\alpha_T(b, t) = 1 \text{ for all } (b, t) \in B \times T.$$

Now let $m$ be the $p$-Sylow index $[G : P] = |X|$ in $G$. Since $m$ is prime to $|A|$,
there exists an integer $k$ such that $mk \equiv 1 \bmod |A|$. Since $A$ is abelian, the factors of
any finite product of elements of $A$ can be written in any order. So there is a function
$\gamma : F \to A$ defined by

$$\gamma(f) = \prod_{x \in X} \alpha_T(x, f),$$

where each $x \in X$ is identified with its coset $Ax \in F$. Now, multiplying each side of
Eq. (5.9) as $f$ ranges over $X$ (and using $\alpha_T(bf, h) = \alpha_T(f, h)$ for $b \in B$, which
follows from Eq. (5.9) and $\alpha_T(b, t) = 1$), one obtains

$$\gamma(h)\,\gamma(g)^h = \gamma(gh)\,\alpha_T(g, h)^m.$$

Then, setting $\beta(g) := \gamma(g)^{-k}$ for all $g \in F$, we obtain Eq. (5.11). Thus $G$ splits over
$A$.
Noting that conversely, if $G$ splits over $A$, so does $P$, we have in fact proved,


Theorem 5.5.1 (Gaschütz) Suppose $G$ is a finite group with a normal abelian
$p$-subgroup $A$. Then $G$ splits over $A$ if and only if some $p$-Sylow subgroup of $G$
splits over $A$.


Next suppose $A$ is a normal abelian subgroup of the finite group $G$ and that
$H$ is a complement of $A$ in $G$, so $G = HA$ and $H \cap A = 1$. Now suppose some
second subgroup $K$ was also a complement to $A$. Then there is a natural isomorphism
$\mu : H \to K$ factoring through $G/A$, and a function $\beta : H \to A$ so that

$$\mu(h) = h\beta(h) \text{ for all } h \in H.$$

Expressed slightly differently, $K$ is a transversal $\{k_h \mid h \in H\}$ whose elements can be
indexed by $H$, and $k_h = h\beta(h)$ for all $h \in H$. Since $K$ is a group, it is easy to see
that for $h, g \in H$,

$$k_g \cdot k_h = k_{gh} = (gh)\,\beta(gh)$$
$$= g\beta(g) \cdot h\beta(h)$$
$$= gh\,\beta(g)^h \beta(h).$$

So

$$\beta(gh) = \beta(g)^h \beta(h), \text{ for all } g, h \in H. \tag{5.12}$$

Now assume $m = [G : A]$ is relatively prime to $|A|$, and select an integer $n$ such
that $mn \equiv 1 \bmod |A|$. Then as $A$ is abelian, the product $\prod_{g \in H} \beta(g)$, taken as $g$ ranges
over $H$, is a well-defined constant $b$, an element of $A$. Forming a similar product of
the terms on both sides of Eq. (5.12) as $g$ ranges over $H$, one now has

$$b = b^h\,\beta(h)^m,$$

so

$$\beta(h) = a \cdot a^{-h}, \text{ where } a = b^n,\ h \in H.$$

Then for every $h \in H$,

$$k_h = h\beta(h) = h \cdot a^{-h} \cdot a = a^{-1}ha.$$

Thus $K = a^{-1}Ha$ is a conjugate of $H$. So we have shown the following:


Lemma 5.5.2 If $A$ is a normal abelian subgroup of a group $G$ of finite order, and
the order of $A$ is relatively prime to its index $[G : A]$, then any two complements of
$A$ in $G$ are conjugate.
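
A small instance of Lemma 5.5.2 can be checked by enumeration. The sketch below takes $G$ to be the alternating group on four letters and $A$ its normal four-group, of order 4 and index 3; this example is our choice, not one taken from the text. Subgroups are found as those generated by at most two elements, which is enough to find every complement here since complements have order 3 and are cyclic. All names are illustrative.

```python
def compose(p, q):
    """Permutation 'first p, then q' (tuples mapping i -> p[i])."""
    return tuple(q[i] for i in p)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generate(gens, n):
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def conjugate(H, a):
    """The subgroup a^-1 H a."""
    return frozenset(compose(compose(inverse(a), h), a) for h in H)

identity = (0, 1, 2, 3)
alt4 = generate([(1, 2, 0, 3), (1, 0, 3, 2)], 4)   # a 3-cycle and (01)(23)
four_group = generate([(1, 0, 3, 2), (2, 3, 0, 1)], 4)

# Subgroups generated by at most two elements (sufficient for this example).
elems = sorted(alt4)
subgroups = {generate([x, y], 4) for x in elems for y in elems}

# H is a complement of A when H meets A trivially and |H| |A| = |G|.
complements = {H for H in subgroups
               if H & four_group == {identity}
               and len(H) * len(four_group) == len(alt4)}

H0 = next(iter(complements))
orbit_under_A = {conjugate(H0, a) for a in four_group}
print(len(complements))              # 4
print(orbit_under_A == complements)  # True
```

The four complements are the subgroups of order 3, and conjugating any one of them by the elements of $A$ already produces all four.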


The reader should be able to use Gaschütz's Theorem and the preceding Lemma
to devise an induction proof of the following


Theorem 5.5.3 (The lower Schur-Zassenhaus Theorem) Suppose N is a solvable 
normal subgroup of the finite group G whose order is relatively prime to its index 
[G : N]. Then the following statements are true: 


1. There exists a complement H of N in G.
2. Any two complements of N in G are conjugate by an element of N. 


On the other hand, using little more than Sylow’s theorem, a Frattini argument, 
and induction on group orders one can easily prove the following: 


Theorem 5.5.4 (The upper Schur-Zassenhaus Theorem) Suppose N is a normal 
subgroup of the finite group G, such that |N| and its index [G : N] are coprime. If 
G/N is solvable, then the following holds: 


1. There exists a complement H of N in G. 
2. Any two complements of N in G are conjugate by an element of N. 


Remark Both the lower and upper Schur-Zassenhaus theorems contain an assump- 
tion of solvability, either on N or G/N. However this condition can be dropped 
altogether, because of the famous Feit-Thompson theorem [18] which says that any 
finite group of odd order must be solvable. The hypothesis that |N| and [G : N] are 
coprime forces one of N or G/N to have odd order and so at least one is solvable. Of 
course the Feit-Thompson theorem is far beyond the sophistication of this relatively 
elementary course. 


A somewhat overlooked application of the Schur-Zassenhaus theorem is the
following:


Theorem 5.5.5 (Glauberman) Let A be a group of automorphisms acting on a group 
G leaving invariant some coset Hx of a subgroup H. Then the following hold: 
1. $H$ is itself $A$-invariant.
2. If $A$ and $H$ have coprime orders (so that at least one is solvable), then $A$ fixes an
element of the coset $Hx$.


Proof This is Exercise (10) in Sect. 5.6.1.


5.6 Exercises 


5.6.1 Elementary Exercises 


1. We say that a subgroup $A$ of $G$ is subnormal in $G$ if and only if there is a finite
ascending chain of subgroups

$$A = N_0 \le N_1 \le \cdots \le N_m = G$$

from $A$ to $G$, with each member of the chain normal in its successor, i.e.
$N_i \trianglelefteq N_{i+1}$, $i = 0, \dots, m-1$.

(a) Using the Corollary to the Fundamental Theorem of Homomorphisms
(Corollary 3.4.6), show that $SN(G)$ is a semi-modular lower semi-lattice.

(b) Use Theorem 2.4.2, part (ii), to conclude that in a group for which $SN(G)$
has the descending chain condition (for example, a finite group) the join of any
collection of subnormal subgroups of a group is subnormal.


2. Suppose $\mathcal{N}$ is a family of normal subgroups of $G$. Let $\Sigma\mathcal{N}$ be defined to be the
set of elements of $G$ which are expressible as a finite product of elements each
of which belongs to a subgroup in $\mathcal{N}$.

(a) Show that $\Sigma\mathcal{N}$ is the subgroup of $G$ generated by the subgroups in $\mathcal{N}$, i.e.
$\Sigma\mathcal{N} = \langle \mathcal{N} \rangle := \langle N \mid N \in \mathcal{N} \rangle$.

(b) We say that $\mathcal{N}$ is a normally closed family if and only if for any non-empty
subset $\mathcal{M} \subseteq \mathcal{N}$, $\langle \mathcal{M} \rangle \in \mathcal{N}$. A group is said to be a torsion group if and
only if every element of the group has finite order. (This does not mean that the
group is finite. Far from it: there are many infinite torsion groups.) Show
that the family $\mathcal{T}(G)$ of all normal torsion subgroups of $G$ is a normally
closed family. [HINT: When any element $x$ is expressed as a finite product
$x = a_1 \cdots a_k$ with $a_i \in N_i \in \mathcal{N}$, only a finite number of groups $N_i$ are
involved. So the proof comes down to the case $k = 2$.] Then the group

$$\mathrm{Tor}(G) := \langle \mathcal{T}(G) \rangle \in \mathcal{T}(G)$$

is a characteristic subgroup of $G$.

3. Fix a prime number $p$. A group $H$ is said to be a $p$-group if and only if every
element of $H$ has order a power of the prime $p$. Show that the collection of all
normal $p$-subgroups of a (not necessarily finite) group $G$ is normally closed. (In
this case the unique maximal member of the collection of all normal
$p$-subgroups of $G$ is denoted $O_p(G)$.)

4. Let $G$ be any group, finite or infinite.

(a) Show that any subnormal torsion subgroup of $G$ lies in $\mathrm{Tor}(G)$.
(b) Show that any subnormal $p$-subgroup of $G$ lies in $O_p(G)$.

5. Let $\mathbb{Q}$ denote the additive group of the rational numbers. The group $\mathbb{Z}_{p^\infty}$ is
defined to be the subgroup of $\mathbb{Q}/\mathbb{Z}$ generated by the infinite set $\{1/p, 1/p^2, \dots\}$.
Show that this is an infinite $p$-group. Also show that it is indecomposable, that
is, it is not the direct product of two of its proper subgroups.

6. Let $\mathcal{F}$ be a family of abstract groups which is closed under taking unrestricted
subdirect products; that is, if $H$ is a subgroup of the direct product $\prod_{\sigma \in I} X_\sigma$,
with each $X_\sigma \in \mathcal{F}$, and the restriction of each canonical projection

$$\pi_\tau : \prod_{\sigma \in I} X_\sigma \to X_\tau$$

to $H$ is an epimorphism, then $H \in \mathcal{F}$.

(a) Show that there exists a unique normal subgroup $N_{\mathcal{F}}$ of $G$ which is minimal
with respect to being a normal subgroup which yields a factor $G/N_{\mathcal{F}} \in \mathcal{F}$.⁵

(b) Show that there is always a subgroup $G'$ of $G$ minimal with respect to these
properties:

(i) $G' \trianglelefteq G$,
(ii) $G/G'$ is abelian.

(c) Explain why there is not always a subgroup $N$ of $G$ minimal with respect to
being normal and having $G/N$ a torsion group (or even a $p$-group). [Hint:
Although torsion groups are closed under taking direct sums, they are not
closed under taking direct products. This contrasts with Exercise (6) in this
section where the family of groups was closed under unrestricted subdirect
products.]

(d) If $G$ is finite, and $\mathcal{F}$ is closed under finite subdirect products (that is, the
set of possibilities for $\mathcal{F}$ is enlarged because of the weaker requirement that
it is enough that any subgroup $H$ of a finite direct product of members of $\mathcal{F}$
which remains surjective under the restrictions of the canonical projections
is still in $\mathcal{F}$) then there does exist a subgroup $N_{\mathcal{F}}$ which is minimal with
respect to the two requirements: (i) $N_{\mathcal{F}} \trianglelefteq G$, and (ii) $G/N_{\mathcal{F}} \in \mathcal{F}$. (If $\pi$ is
any set of (non-negative) prime numbers, a group $G$ is said to be a $\pi$-group
if and only if every element of $G$ has finite order divisible only by the primes
in $\pi$. In particular, a $\pi$-group is a species of torsion group, and a $p$-group is
a species of $\pi$-group.)
Show that all $\pi$-groups are closed under finite subdirect products. [In this
case we write

$$N_{\mathcal{F}} = O^{\pi}(G),\ O^{p}(G), \text{ or } O^{p'}(G)$$

where $\pi = \{p\}$ or all primes except $p$, in the last two cases.]


⁵ Note that as a class of abstract groups $\mathcal{F}$ can be viewed as either (i) a collection of groups
closed under isomorphisms, or (ii) a disjoint union of isomorphism classes of groups. Under the
former interpretation, the "membership sign", $\in$, as in "$G/N_{\mathcal{F}} \in \mathcal{F}$" is appropriate. But in the
second case, "$\in$" need only be replaced by "an isomorphism with a member of $\mathcal{F}$". One must get
used to this.


7. Suppose $\mathcal{F}$ is a family of groups closed under normal products. Suppose $G$
possesses a subgroup $N$ that is maximal with respect to being both normal and
being a member of $\mathcal{F}$ (this occurs when $G$ is finite). Is it true that any subnormal
subgroup of $G$ which belongs to the family $\mathcal{F}$ lies in $N$? Prove it, or give a
counterexample.

8. Suppose $A$ and $B$ are normal subgroups of a group $G$. Prove the following:

(a) If $A$ and $B$ normalize subgroup $C$, then

$$[AB, C] = [A, C][B, C].$$

[Hint: Use the commutator identity (5.2) and the fact that $B$ normalizes
$[A, C]$ to show the containment of the left side in the right side.]

(b) Show that the $k$-fold commutator $[AB, \dots, AB]$ is a normal product

$$\prod [X_1, X_2, \dots, X_k],$$

where the $X_i = A$ or $B$, and the product extends over all $2^k$ possibilities for
the $k$-tuples $(X_1, \dots, X_k)$. (Of course each factor in the normal product is
a normal subgroup of $G$. But some of them occur more than once since the
cases with $X_1 = A$ and $X_2 = B$ are duplicated in the cases with $X_1 = B$
and $X_2 = A$.)

(c) Show that if $A$ and $B$ are nilpotent of class $k$ and $\ell$, respectively, then $AB$
is nilpotent of class at most $k + \ell$. [Hint: Consider $\gamma_{k+\ell+1}(AB)$ and use the
previous step.]

(d) Prove that in a finite group, there exists a normal nilpotent subgroup which
contains all other normal nilpotent subgroups of $G$. (This characteristic
subgroup is called the Fitting subgroup of $G$ and is denoted $F(G)$.)


9. Let $G$ be a finite group. Show that if $A$ and $B$ are solvable normal subgroups
of a group $G$, then their normal product $AB$ is also solvable. [Hint: Use the
commutator identities.] Conclude from this that if $G$ is finite, then there exists a
unique subgroup that is maximal with respect to being both normal and solvable.
Clearly, it contains all other solvable normal subgroups of $G$.⁶


⁶ This group is sometimes called the "solvable radical", but this usage, which intrudes on similar
terminology from non-associative algebras, is far from uniform.
10. Prove Theorem 5.5.5. The finite group $G$ contains a subgroup $H$, and $A$ is a
group of automorphisms of $G$ which leaves the coset $Hx$ invariant.

(i) Show that $A$ leaves $H$ invariant.
(ii) If $A$ and $H$ have coprime orders, show that some element of $Hx$ is fixed
by the automorphism group $A$.

[Hint: For the first part, note that for every $a \in A$, $x^a = h_a x$ for some element
$h_a \in H$. Compute $(hx)^a$ for arbitrary $h \in H$.

For the second part, note that in the semidirect product $GA$, $A$ normalizes $H$ by
part (i), so $HA = AH$ is a subgroup of $GA$. Show that one can define a transitive
action of $HA$ on the elements of the coset $Hx$ in which the elements of $A$ act
as they do as automorphisms, but the elements of $H$ act by left multiplication.
(One must show that the left-multiplication action of $h^a$ is the composition of the
actions of $a^{-1}$, of $h$ and of $a$, in that order.) Let $X$ be the subgroup of $HA$ fixing
a "letter", say $x$, in $Hx$. Since $H$ is transitive, $HA = HX$. Now $X$ and $A$ are
two complements of $H$ in the group. Apply the Schur-Zassenhaus theorem.]


11. Here is an interesting simplicity criterion. Let $G$ be a group acting faithfully
and primitively on the set $X$, and let $H$ be the stabilizer of some element of $X$.
Assume

(a) $G = G'$,
(b) $H$ contains a non-identity normal solvable subgroup $A$ whose conjugates
generate $G$.

Prove that $G$ is simple.

[Hint: First note that $G = 1$ is not possible since $A \ne 1$. The primitive
faithful action of $G$ on $X$ makes $G$ and any non-trivial normal subgroup $N$ of
$G$ transitive on $X$, so $G = HN$. Then show $AN$ is normal, and so contains all
conjugates of $A$. The final contradiction that is around the corner must exploit
the solvability of $A$.]


5.6.2 The Baer-Suzuki Theorem 


The next few exercises spell out a subtle but well-known theorem known as the
Baer-Suzuki theorem.⁷ The proof considered here is actually due to Alperin and
Lyons [2].

In the following, $G$ is a finite group. There is some new but totally natural notation
due to Aschbacher. Suppose $\Omega$ is a collection of subgroups of $G$ and $H$ is a subgroup
of $G$. Then the symbol $\Omega \cap H$ denotes the collection of all subgroups in $\Omega$ which
happen to be contained in the subgroup $H$. (Of course this collection could very
well be empty.) If $\gamma$ is any collection of subgroups of $G$, the symbol $N_G(\gamma)$ denotes
the subgroup of $G$ of all elements which under conjugation of subgroups leave the
collection $\gamma$ invariant.


⁷ Not to be confused with the "Brauer-Suzuki Theorem".


1. Suppose $\Omega$ is a G-invariant collection of p-subgroups of G. Let Q be any 
p-subgroup of G and let $\Gamma$ be any subset of $\Omega \cap Q$ for which 
$\Gamma \subseteq N_G(\Gamma)$. Show that either 


(a) $\Gamma = \Omega \cap Q$, or 
(b) there is an element $X \in \Omega \cap Q - \Gamma$ which also lies in $N_G(\Gamma)$. 


[Hint: Without loss of generality, one may assume that 
$\Gamma = \Omega \cap Q \cap N_G(\Gamma)$. Then every element of $N_Q(N_Q(\Gamma))$ 
leaves $\Gamma = \Omega \cap N_Q(\Gamma)$ invariant, so $N_Q(N_Q(\Gamma)) = N_Q(\Gamma)$. 
Since Q is a nilpotent group, $\Gamma$ is a Q-invariant collection of subgroups 
of Q and the conclusion follows easily.] 


2. Prove the following: 


Theorem 5.6.1 (The Baer-Suzuki Theorem) Suppose X is some p-subgroup of the 
finite group G. Either $X \le O_p(G)$ or $\langle X, X^g \rangle$ is not a p-subgroup for some 
conjugate $X^g \in X^G$. 


[Hint: Set $\Omega = X^G$, choose P a p-Sylow subgroup of G which contains X, and 
set $\Delta := \Omega \cap P$. Since $\Delta = \Omega$ implies $\langle \Omega \rangle$ is a normal p-group, 
forcing the conclusion $X \le O_p(G)$, we may assume $\Omega - \Delta \ne \emptyset$. 

Now show that for any subgroup $Y \in \Omega - \Delta$, $\langle Y, \Delta \rangle$ cannot be a p-group. 
[If R is a p-Sylow subgroup containing $\langle Y, \Delta \rangle$, compare $|\Omega \cap R|$ and 
$|\Omega \cap P|$.] 

Next consider the set S of all pairs $(Y, \Gamma)$ where $Y \in \Omega - \Delta$, 
$\Gamma \subseteq \Delta$ and $\langle Y, \Gamma \rangle$ is a p-group. (We have just seen that this 
collection is non-empty.) Among these pairs, choose one, say $(Y, \Gamma) \in S$, so 
that $|\Gamma|$ is maximal. Now if $\Gamma = \emptyset$, $\langle Y, X \rangle$ is not a p-group and we are 
done. Thus we may assume $\Gamma$ is non-empty. 

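Now set $Q := \langle Y, \Gamma \rangle$, a p-group. Then, noting that $\langle Y, \Delta \cap Q \rangle$ is a 
p-group, the maximality of $\Gamma$ forces $\Gamma = \Delta \cap Q$. Now 
$\bigcup_{X \in \Gamma} X \subseteq P \subseteq N_G(\Delta)$ and $\Gamma \subseteq Q$ together imply 


$\Gamma \subseteq N_G(Q) \cap N_G(\Delta) \subseteq N_G(Q \cap \Delta) = N_G(\Gamma)$. 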


By the previous exercise, either $\Gamma = \Omega \cap Q$ or there is a subgroup 
$Y' \in \Omega \cap Q - \Gamma$ with $Y' \subseteq N_G(\Gamma)$. The former alternative is dead 
from the outset because of the existence of Y. Now Y' belongs to $\Omega - \Delta$, and 
$\langle Y', \Gamma \rangle$ is a p-subgroup, as it is a subgroup of Q. Thus $(Y', \Gamma)$ is also 
one of the extreme pairs in S. 

Similarly, since $\langle Y', \Delta \rangle$ is not a p-group, $\Gamma \ne \Delta$ and so, applying the 
previous exercise once more with P in place of Q, there must exist a subgroup 
$Z \in \Delta - \Gamma$ with $Z \le N_P(\Gamma)$. Thus 

$\{Y', Z\} \cup \Gamma \subseteq \Omega \cap N_G(\Gamma)$. 

Now $N_G(\Gamma)$ is a proper subgroup of G (otherwise some element of $X^G$ lies in 
the normal p-subgroup $\langle \Gamma \rangle$). Now for the first time, we exploit induction on 
the group order to conclude that 

$\langle Y', \Gamma \cup \{Z\} \rangle$ 


is a p-group. Clearly this contradicts the maximality of $|\Gamma|$, completing the 
proof.] 


3. The proof just sketched is very suggestive of other generalizations, but there are 
limits. We say that a finite group G is p-nilpotent if and only if $G/O_{p'}(G)$ is a 
p-group. 

Show that the collection of finite p-nilpotent subgroups is closed under finite 
normal products. (In a finite group, there is then a unique maximal normal 
p-nilpotent subgroup, usually denoted $O_{p'p}(G)$.) 

4. Is the following statement true? Suppose X is a p-subgroup of the finite group 
G. Then either $X \le O_{p'p}(G)$ or else there exists a conjugate $X^g \in X^G$ such 
that $\langle X, X^g \rangle$ is not p-nilpotent. [Hint: Think about the fact that in any group, 
the group generated by two involutions is a dihedral group, and so is 2-nilpotent.] 


Chapter 6 
Generation in Groups 


Abstract Here, the free group on set X is defined to be the automorphism group of 
a certain tree with labeled edge-directions. This approach evades some awkwardness 
in dealing with reduced words. The universal property that any group generated by 
a set of elements X is a homomorphic image of the free group on X, as well as 
the fact that a subgroup of a free group is free (possibly on many more generators) 
are easy consequences of this definition. The chapter concludes with a discussion of 
(k, l, m)-groups and the Brauer-Ree theorem. 


6.1 Introduction 


One may recall that for any subset X of a given group G, the subgroup generated by 
X, denoted $\langle X \rangle$, always has two descriptions: 


1. It is the intersection of all subgroups of G which contain X; 
2. It is the subset of all elements of G which are expressible as a finite product of 
elements which are either in X or are the inverse (in G) of an element of X. 


The two notions are certainly equivalent, for the set of elements described in item 
2 is closed under group multiplication and taking inverses and so, by the Subgroup 
Criterion (Lemma 3.2.3), is a subgroup of G containing X. On the other hand it must 
be a subset of every subgroup containing X. So this set is the same subgroup which 
was denoted $\langle X \rangle$. 
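
The second description can be turned into a small computation. The following is a minimal sketch, not part of the original text: it assumes permutations of {0, ..., n-1} stored as tuples of images, and the helper names `compose`, `inverse` and `generated_subgroup` are hypothetical. It closes the identity under right multiplication by the elements of X and their inverses.

```python
def compose(p, q):
    """Return the permutation 'first p, then q' (tuples of images)."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(gens, n):
    """All finite products of elements of gens and their inverses."""
    identity = tuple(range(n))
    letters = list(gens) + [inverse(g) for g in gens]
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for x in letters:
            h = compose(g, x)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


# Illustrative choice: a = (12) and b = (23), written on the points 0, 1, 2.
a = (1, 0, 2)
b = (0, 2, 1)
H = generated_subgroup([a, b], 3)
```

Here `H` is the whole of Sym(3), while `generated_subgroup([a], 3)` is the subgroup of order two generated by a alone.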

But there is a difference in the way the two notions feel. Our point of 
view in this chapter will be in the spirit of the second of these two notions. 


6.2 The Cayley Graph 


6.2.1 Definition 


Suppose we are given a group G and a subset X of G. X is said to be a set of 
generators of G if and only if $\langle X \rangle = G$. For any such set of generators X, we can 
obtain a directed graph C(G, X) without multiple edges as follows: 


1. The vertex set of C := C(G, X) is the set of elements of G. 

2. The directed edges are the ordered pairs of vertices (g, gx), where $g \in G$ and 
$x \in X$. (It is our option to label such an edge by the symbol “x”. Note that if x 
and y are distinct elements of X, then $gx \ne gy$, so each directed edge receives a 
unique label by this rule.) 


The directed edge-labelled graph C(G, X) is called the Cayley graph of the 
generating set X of the group G. Note that if the identity element 1 is a member of X, 
loops are possible. But of course 1, being a non-generator, can always be removed 
from X without disturbing any of our assumptions about X and G. If we do this, no 
loops will appear. 

One now notices that left multiplication of all the vertices of C (that is, the elements 
of G) by a given element h of G induces an automorphism $\pi_h$ of C (in fact one 
preserving the labels) since (hg, hgx) is also an edge labelled by “x”. 

Let us look at a very simple example. The set X = {a = (12), b = (23)} is a set 
of generators of the group G = Sym(3), the symmetric group on the set of letters 
{1, 2, 3}. The Cayley graph then has six vertices, and each vertex has two out-going 
edges labeled “a” and “b”, respectively, and two in-going edges labeled “a” and “b” 
as well. Indeed, we could coalesce an outgoing and ingoing edge with the same label, 
as one undirected edge of that label. This only happens when the label indicates an 
involution of the group. In this case we get a hexagon with sides labelled “a” and “b” 
alternately. 
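
The Sym(3) example can be checked mechanically. The sketch below is an illustration under stated assumptions: permutations are tuples of images on the points 0, 1, 2 (so a = (12) becomes `(1, 0, 2)`), products are read left to right, and the names `group_elements`, `cayley_edges` and `left_translate` are hypothetical helpers, not notation from the text.

```python
def compose(p, q):
    """Return the permutation 'first p, then q' (tuples of images)."""
    return tuple(q[p[i]] for i in range(len(p)))


GENS = {"a": (1, 0, 2), "b": (0, 2, 1)}  # a = (12), b = (23)
IDENTITY = (0, 1, 2)


def group_elements(gens, identity):
    """Close the identity under right multiplication by the generators."""
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for x in gens.values():
            h = compose(g, x)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def cayley_edges(elements, gens):
    """Labelled directed edges (g, gx, label) of C(G, X)."""
    return {(g, compose(g, x), label)
            for g in elements for label, x in gens.items()}


def left_translate(h, edges):
    """Image of the edge set under left multiplication by h."""
    return {(compose(h, g), compose(h, gx), label) for g, gx, label in edges}


G = group_elements(GENS, IDENTITY)
EDGES = cayley_edges(G, GENS)

# Walk around the hexagon: follow labels a, b, a, b, a, b from the identity.
walk = [IDENTITY]
for step in range(6):
    walk.append(compose(walk[-1], GENS["a" if step % 2 == 0 else "b"]))
```

Every `left_translate(h, EDGES)` coincides with `EDGES`, which is the label-preserving automorphism property noted above, and `walk` returns to the identity after visiting all six vertices.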


6.2.2 Morphisms of Various Sorts of Graphs 


Since we are talking about graphs, this may be a good time to discuss homomorphisms 
of graphs. But there are various sorts of graphs to discuss. The broad categories are 


1. Simple graphs $\Gamma = (V, E)$ where V is the vertex set, and edges are just 
certain subsets of V of size two. (These graphs record all non-reflexive symmetric 
relations on a set V, for example the acquaintanceship relation among persons 
attending a large party.) 

2. Undirected graphs with multiple edges. (This occurs when the edges themselves 
live an important life of their own, for example when they are the bridges 
connecting islands, airplane flights connecting cities, or doors connecting rooms.) 

3. Simple graphs with directed edges. Here the edge set is a collection of ordered 
pairs of vertices. (This occurs when one is recording asymmetric relations, for 
example, which football team beat which in a football league, assuming each 
pair of teams plays each other at most once. Or the comparison relation of 
elements in a poset. We even allow two relations between points, for example 
(a, b) and (b, a) may both be directed edges. This occurs in some cities where 
sites are connected by a network of both one-way and two-way streets.) 

4. We could have any of the above, with labels attached to the edges. Cayley graphs, 
for example, can be viewed as directed labeled graphs. 


A graph homomorphism $f : (V_1, E_1) \to (V_2, E_2)$ is a mapping $f : V_1 \to V_2$ 
such that if $(a, b) \in E_1$ then either f(a) = f(b) or $(f(a), f(b)) \in E_2$. If the two 
graphs are labeled, there are labelling mappings $\lambda_i : E_i \to L_i$ assigning to each 
(directed) edge a label from a set of labels $L_i$, i = 1, 2. Then we require an auxiliary 
mapping $f' : L_1 \to L_2$, so that if $(a, b) \in E_1$ carries label x and if (f(a), f(b)) is 
an edge, then it too carries label f'(x). 


6.2.3 Group Morphisms Induce Morphisms of Cayley Graphs 


Now suppose $f : G \to H$ is a surjective homomorphism of groups. Then it is easy 
to see that if X is a set of generators of G, then f(X) is also a set of generators 
of f(G) = H. Now f is a fortiori a map from the vertex set of the Cayley graph 
C(G, X) to the vertex set of the Cayley graph C(H, f(X)). For $(g, x) \in G \times X$, 
(g, gx) is a directed edge labelled x. Then clearly either f(g) = f(gx) or else 


(f(g), f(gx)) = (f(g), f(g)f(x)) is a directed edge of C(H, f(X)), labelled f(x). 
Thus 


Lemma 6.2.1 If X is a set of generators of a group G and $f : G \to H$ is a 
surjective morphism of groups, then f(X) is a set of generators of H and f induces 
a homomorphism of the Cayley graphs, $f : C(G, X) \to C(H, f(X))$, as labeled 
directed graphs.¹ 
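
As a small sanity check of Lemma 6.2.1, one can push the Cayley graph of Sym(3) through the sign homomorphism. The sketch below rests on assumptions not in the text: the target group H is modelled as the integers mod 2 under addition, and `sign_bit` is a hypothetical helper returning 0 for even and 1 for odd permutations.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'first p, then q' (tuples of images)."""
    return tuple(q[p[i]] for i in range(len(p)))


def sign_bit(p):
    """Parity of p: number of inversions mod 2 (0 = even, 1 = odd)."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2


G = list(permutations(range(3)))          # all of Sym(3)
X = {"a": (1, 0, 2), "b": (0, 2, 1)}      # generators (12) and (23)

# f(X) generates H = Z/2; here both generators map to the element 1.
f_X = {sign_bit(x) for x in X.values()}

# Edges (h, h + s) of the Cayley graph C(H, f(X)), labelled s.
H_edges = {(h, (h + s) % 2, s) for h in (0, 1) for s in f_X}

# Images (f(g), f(gx)) of the edges (g, gx) of C(G, X), labelled f(x).
image_edges = {(sign_bit(g), sign_bit(compose(g, x)), sign_bit(x))
               for g in G for x in X.values()}
```

In this instance no edge collapses to a single vertex, and `image_edges` is exactly the edge set `H_edges`.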


6.3 Free Groups 


6.3.1 Construction of Free Groups 


The Cayley graphs of the previous section were defined by the existence of a group 
G (among other things). Now we would like to reverse this; we should like to start 
with a set X and obtain out of it a certain group which we shall denote F(X). 


¹ Note that the graph morphism has also been denoted by the symbol f. This slight 
abuse of notation is motivated by the fact that f is indeed the mapping being applied 
to both vertices and labels of the domain Cayley graph. 


We do this in stages. 

First we give ourselves a fixed abstract set X. The free monoid on X is the collection 
M(X) of “words” $x_1 x_2 \cdots x_k$ spelled with “letters” $x_i$ chosen from the alphabet X.² 
The number k is allowed to be zero, that is, there is a unique “word” spelled with 
no letters at all, which we denote by the symbol $\phi$ and call the empty word. The 
number k is called the length of the word which, as we have just seen, can be zero. 

The concatenation of two words $u = x_1 \cdots x_k$ and $w = y_1 y_2 \cdots y_\ell$, with 
$x_i, y_j \in X$, is the word 

$u * w := x_1 \cdots x_k y_1 \cdots y_\ell$. 


The word “bookkeeper” is the concatenation of “book” and “keeper”; 12345 is the 
concatenation 123 ∗ 45 or 12 ∗ 345, and so forth. Clearly the concatenation operation 
“∗” is a binary operation on the set W(X) of all words over the alphabet X, and it 
is associative (though very non-commutative). Moreover, for any word $w \in W(X)$, 
$\phi * w = w * \phi = w$. So the empty word $\phi$ is a two-sided identity with respect 
to the operation “∗”. Thus M(X) = (W(X), ∗) is an associative semigroup with 
identity, that is, a monoid.³ Indeed, since it is completely determined by the abstract 
set X alone, it is called the free monoid on (alphabet) X. 

Now let $\sigma : X \to X$ be an involutory fixed-point-free permutation of X. This 
means $\sigma$ is a bijection from X to itself such that 


1. $\sigma^2 = 1_X$, the identity mapping on X, and 
2. $\sigma(x) \ne x$ for all elements x in X. 


We are going to construct a directed labelled graph $\Gamma = \mathcal{F}(X, \sigma)$, called a frame, 
which is completely determined by X and the involution $\sigma$. 

Let $W^*(X)$ be the subset of W(X) of those words $y_1 y_2 \cdots y_r$, r any non-negative 
integer, for which $y_i \ne \sigma(y_{i+1})$ for any i = 1, ..., r − 1. These are the words 
which never have a “factor” of the form $y\sigma(y)$. (Notice that these words contain no 
factors of the form $\sigma(y)y$ as well, since $\sigma(y)y = u\sigma(u)$ where $u = \sigma(y)$.) We call 
such words reduced words. 
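
To make the definition of $W^*(X)$ concrete, here is a minimal sketch. It assumes, purely for illustration, that X consists of lower-case and upper-case letters and that the involution $\sigma$ swaps the case of a letter; `sigma`, `is_reduced` and `reduced_words` are hypothetical names.

```python
from itertools import product


def sigma(letter):
    """Illustrative fixed-point-free involution: swap the case of a letter."""
    return letter.swapcase()


def is_reduced(word):
    """True when no factor y sigma(y) occurs, i.e. y_i != sigma(y_{i+1})."""
    return all(word[i] != sigma(word[i + 1]) for i in range(len(word) - 1))


def reduced_words(alphabet, length):
    """All reduced words of the given length over the alphabet."""
    words = ("".join(t) for t in product(alphabet, repeat=length))
    return [w for w in words if is_reduced(w)]


# With X = {a, A, b, B} there are 1, 4, 12, 36 reduced words of length 0..3.
counts = [len(reduced_words("aAbB", k)) for k in range(4)]
```

The counts grow like $4 \cdot 3^{k-1}$, since after the first letter each new letter y may be anything except $\sigma$ of its predecessor.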

We now construct a labelled directed graph $\Gamma$ whose vertices are these reduced 
words in $W^*(X)$. The set E of oriented labelled edges are of two types: 


1. Ordered pairs $(w, wy) \in W^*(X) \times W^*(X)$ are to be directed edges labelled “y”. 
(Note that the nature of the domain forces the reduced word w not to end in the 
letter $\sigma(y)$, since otherwise wy would not be reduced.) 

2. Similarly those ordered pairs of the form (wy, w) (under the same hypothesis 
that w not end in $\sigma(y)$) are directed edges labeled by “$\sigma(y)$”. 


The two types of directed edges are distinguished this way: for an edge of the first 
kind, the “head” of the directed edge is a word of length one more than its “tail”, 
while the reverse is the case for an edge of the second kind. 


² Of course, to be formal about it, one can regard these as finite sequences 
$(x_1, \ldots, x_k)$, but the analogy with linguistic devices is helpful because the binary 
operation we use here is not natural for sequences. 


³ Monoids were introduced in Chap. 1. 


Then the directed graph $\Gamma = (W^*(X), E)$ which we have defined and have called 
a frame, has these properties: 


1. Let $V_k$ be the set of words of $W^*(X)$ of length k. Then $V_0 = \{\phi\}$, and the vertex 
set partitions as $W^*(X) = V_0 + V_1 + V_2 + \cdots$. 

2. Each vertex has all its outgoing edges labeled by X; it also has all its ingoing 
edges labeled by X. Each directed edge (a, b) that is labeled by $y \in X$ has its 
“transpose” (b, a) labelled by $\sigma(y)$. 

3. For k > 0, each vertex in $V_k$ receives just one ingoing edge from $V_{k-1}$ and gives 
one outgoing edge to the same vertex in $V_{k-1}$. All further edges leaving or arriving 
at this vertex are to or from vertices of $V_{k+1}$. 

4. Given any ordered pair of vertices $(w_1, w_2)$ (recall the $w_i$ are reduced words) 
there is a unique directed path in $\Gamma$ from vertex $w_1$ to vertex $w_2$. It follows that 
there are no circuits other than “backtracks”, that is, circular directed walks of 
the form 


$(v_1, v_2, \ldots, v_m, \ldots, v_{2m+1})$ 


where $v_j = v_{2m+2-j}$, and $(v_j, v_{j+1})$ and $(v_{j+1}, v_j)$ are all directed edges, 
j = 1, ..., m. 


As usual, the symbol Aut($\Gamma$) denotes the automorphism group of the directed 
labeled graph $\Gamma$. At this point we define a monoid homomorphism 


$\mu : M(X) \to \mathrm{Aut}(\Gamma)$. 


For each $y \in X$ and vertex w in $W^*(X)$ define 


$\mu(y)(w) = yw$ if w does not begin with the letter $\sigma(y)$, 
$\mu(y)(w) = w'$ if w has the form $\sigma(y)w'$. 


It is then quite clear that $\mu(y)$ is a label-preserving automorphism of $\Gamma$ (this needs 
only be checked at vertices w of very short length where “cancellation” affects the 
terminal vertices). 


Notice that by definition 


$\mu(\sigma(y)) = (\mu(y))^{-1}$, 


the inverse of the automorphism induced by y. 


Next, for any word $m = y_1 y_2 \cdots y_k$ in the monoid M(X) we set 


$w^m := \mu(m)(w) = \mu(y_1)(\mu(y_2)(\cdots(\mu(y_k)(w))\cdots)) = w^{\mu(y_k) \cdots \mu(y_1)}$, 

that is, $\mu(m)$ is the composition 


$\mu(y_1) \circ \cdots \circ \mu(y_k)$ 


of automorphisms of $\Gamma$. Of course, since the automorphism group Aut($\Gamma$) is assumed 
to act as right operators, we can express this composition as $\mu(y_k) \cdots \mu(y_1)$, where 
juxtaposition in this expression denotes multiplication in the group Aut($\Gamma$). 

Then $\mu(m_1 m_2) = \mu(m_1) \circ \mu(m_2)$ for all $(m_1, m_2) \in M(X) \times M(X)$, and 
$\mu(\phi) = 1_{W^*(X)}$. So $\mu$ is a monoid homomorphism 


$\mu : M(X) \to \mathrm{Aut}(\Gamma)$. 


Note that $\mu$ is far from being 1-to-1. For each $y \in X$, $\mu(y\sigma(y)) = 1_\Gamma$, 
the identity automorphism on $\Gamma$. Thus all compound expressions derived from 
the empty word by successively inserting factors of the form $y\sigma(y)$, such as 
$ab\sigma(c)de\sigma(e)\sigma(d)cfg\sigma(g)\sigma(f)\sigma(b)\sigma(a)$, also induce the identity 
automorphism of $\Gamma$. 
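
The action of $\mu$ on reduced words is easy to simulate. In the sketch below, which is illustrative only, letters are lower-case and upper-case characters and $\sigma$ is assumed to swap case; `mu_letter` and `mu` are hypothetical names for $\mu(y)$ and $\mu(m)$. The long word displayed above is encoded as `"abCdeEDcfgGFBA"`.

```python
def sigma(letter):
    """Illustrative involution on the alphabet: swap the case of a letter."""
    return letter.swapcase()


def mu_letter(y, w):
    """Apply mu(y) to the reduced word w: cancel sigma(y) or prepend y."""
    if w and w[0] == sigma(y):
        return w[1:]
    return y + w


def mu(m, w):
    """Apply mu(m) = mu(y_1) o ... o mu(y_k); the last letter of m acts first."""
    for y in reversed(m):
        w = mu_letter(y, w)
    return w


# a b s(c) d e s(e) s(d) c f g s(g) s(f) s(b) s(a), with s(y) written in upper case
long_word = "abCdeEDcfgGFBA"
samples = ["", "a", "Ab", "bbA", "cDcD"]
images = [mu(long_word, w) for w in samples]
```

Each sample vertex is returned unchanged, so `images == samples`, while for a reduced word w one has `mu(w, "") == w`, in line with the remark that $\mu(w)$ carries the empty word to w.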

We now have 


Lemma 6.3.1 Let G = Aut($\Gamma$), the group of all label preserving automorphisms 
of the directed graph $\Gamma$. 


(i) The monoid homomorphism $\mu : M(X) \to G$ is onto. 

(ii) G acts regularly on the vertices of $\Gamma$. 

(iii) The group G is generated by the set $\mu(Y)$, where Y is any system of 
representatives of the $\sigma$-orbits on X. 

(iv) Identities of the form $\mu(m) = 1_\Gamma$ occur if and only if the word $m \in M(X)$ is 
formed by a sequence of insertions of factors $y_i\sigma(y_i)$ starting with the empty 
word. 

(v) $\Gamma$ is the Cayley graph of G with respect to the set of generators X. 


Proof Suppose g is an element of G fixing a vertex w in $\Gamma$. Then, as each (in- or 
out-) edge on w bears a unique label, every vertex next to w is fixed. Since every 
vertex u is connected to the vertex $\phi$, the empty word in $W^*(X)$, by an “undirected 
path” of length equal to the length $\ell(u)$, the undirected version of $\Gamma$ is connected. 
This forces the element g in the first sentence of this paragraph to fix all vertices 
$W^*(X)$ of $\Gamma$. 

But now it is clear that for any word $w \in W^*(X)$, regarded as a monoid element, 
$\mu(w)$ is an element of G taking vertex $\phi$ to w. Thus $\mu(M(X)) = G$, and (i) and (ii) 
are proved. So is the statement that $\mu(X)$ is a set of generators of G in part (iii). 

Suppose now that $w = y_1 y_2 \cdots y_k \in M(X)$ satisfies $\mu(w) = 1_\Gamma$. Then successive 
premultiplications by $y_k$, then $y_{k-1}$ and so on, take the vertex $\phi$ on a journey 
along a directed path with successive arrows labelled $y_k, y_{k-1}, \ldots$, eventually 
returning to $\phi$ via a directed edge $(y_1, \phi)$. At each stage of this journey, the current 
image of $\phi$ is either taken to a vertex that is one length further or else one length 
closer to its original “home”, vertex $\phi$. So there is a position 

$v = \mu(y_j y_{j+1} \cdots y_k)(\phi)$ 


in which a local maximum occurs in the distance from “home”. This means that both 
$v^- = \mu(y_{j+1} \cdots y_k)(\phi)$ and $v^+ = \mu(y_{j-1} y_j \cdots y_k)(\phi)$ are one unit closer to the 
starting vertex $\phi$ than was v. But as pointed out in the construction of $\Gamma$, each vertex 
z at distance d from $\phi$ has only one out-edge and one in-edge to a vertex at distance 
d − 1 from $\phi$, and the two closer vertices are the same vertex, say $z_1$, so that the 
two edges in and out of the vertex z are two orientations of an edge on the same pair 
of vertices $\{z, z_1\}$. It follows therefore that $v^- = v^+$ and that $y_j$, the label of directed 
edge $(v^-, v)$, is the $\sigma$-image of the label $y_{j-1}$ of the reverse directed edge $(v, v^+)$. 
Thus 


$\mu(w) = \mu(y_1 y_2 \cdots y_{j-2} y_{j-1} y_j y_{j+1} \cdots y_k)$ 
$= \mu(y_1 \cdots y_{j-2} y_{j+1} \cdots y_k)$ 
$= 1_\Gamma$. (6.1) 


Now one applies induction on the length of $w' := y_1 \cdots y_{j-2} y_{j+1} \cdots y_k$ to 
complete the proof of (iv). 
Part (v) is obvious. 


Now let Y be a system of representatives of the $\sigma$-orbits on X. Then we have 
$X = Y \cup \sigma(Y)$, and $Y \cap \sigma(Y) = \emptyset$. 


The group G above is canonically defined (one sees) by the set Y alone. To recover 
X one need only formally define X to be two disjoint copies of Y, and invoke any 
bijection $Y \to Y$ to define the involutory permutation $\sigma : X \to X$. 

Moreover, we see from part (iv) of Lemma 6.3.1 that the only relations that can exist 
among the elements of G are consequences of the relations 
$\mu(y) \circ \mu(\sigma(y)) = 1$, $y \in X$. Accordingly the group G is called the free group on 
the set of generators Y and is denoted F(Y).⁴ 


Remark 


1. (Uniqueness.) As remarked, F(Y) is uniquely defined by Y, so if there is a bijection 
$f : Y \to Z$, then there is an induced group isomorphism $F(Y) \cong F(Z)$. 

2. For any frame $\Phi = \mathcal{F}(Z, \zeta)$, Aut($\Phi$) is a free group $F(Z_0)$, for any system $Z_0$ 
of representatives of the $\zeta$-orbits on Z. 

3. (The inverse mapping on $\Gamma$.) Among the reduced words comprising the vertices 
of $\Gamma$, there is a well-defined involutory “inverse mapping” 


superscript $-1 : W^*(X) \to W^*(X)$ 


which fixes only the empty word $\phi$ and takes the reduced word $w = y_1 y_2 \cdots y_k$ 
to the reduced word $w^{-1} := \sigma(y_k)\sigma(y_{k-1}) \cdots \sigma(y_1)$. 

Thus the group-theoretic inverse mapping $\mu_w \mapsto (\mu_w)^{-1}$ is extended to the 
vertices of $\Gamma$ by the rule that $(\mu_w)^{-1} := \mu_{w^{-1}}$: that is, the inverse mapping of 
group elements induces a similar mapping on $W^*(X)$ via the bijection $\mu$ from the 
regular action of G = Aut($\Gamma$) on the vertices of $\Gamma$. 

Thus without ambiguity, we may write $y^{-1}$ for $\sigma(y)$ and $Y^{-1}$ for $\sigma(Y)$, so 
$Y \cap Y^{-1} = \emptyset$. 

(It is this mapping on X which defines the graph $\Gamma$; there is no mention of a 
special system of representatives of $\sigma$-orbits. Thus if Y' is any other such system, 
then F(Y) = F(Y') with “equals”, rather than the weaker isomorphism sign.) 


⁴ Although this is the customary notation, unfortunately, it is not all that distinctive 
in mathematics. For example F(X) can mean the field of rational functions in the 
indeterminates X over the field “F”, as well as a myriad of meanings in Analysis 
and Topology. 


Corollary 6.3.2 Any subgroup of a free group is also a free group. 


Proof Let F(Y) be the free group on the set Y, let $X = Y \cup Y^{-1}$, and let $\Gamma$ be the 
Cayley graph of F(Y) with respect to the set of generators X. We used $\Gamma$ to define 
F(Y), for by the preceding lemma F(Y) is the group of all label-and-direction- 
preserving automorphisms of $\Gamma$. 

Let H be a non-trivial subgroup of F(Y) and let $\Gamma^H$ be the H-orbit on vertices 
of $\Gamma$ which contains the vertex $\phi$, the empty word. Now each vertex in $\Gamma^H$ is a 
reduced word w and there is a unique directed path from $\phi$ to that vertex; moreover 
the labels encountered along this directed path are, in order, the letters which 
“spell” the reduced word w comprising this vertex. An H-neighbor of $\phi$ is a vertex 
w of $\Gamma^H$ such that along the unique directed path in $\Gamma$ from $\phi$ to w, no 
intermediate vertex of $\Gamma^H$ is encountered. Let $Y^H$ denote the collection of all 
H-neighbors of $\phi$. Note from the remarks above, and the fact that H is a subgroup 
of F(Y), that if the reduced word w is an H-neighbor of $\phi$, then so is $w^{-1}$. 

Now we construct a new labelled graph on the vertex set $\Gamma^H$. There is a directed 
edge $(x, y) \in \Gamma^H \times \Gamma^H$ if and only if x = yw where yw is reduced and $w \in Y^H$. 
(This means $x = \mu(y)(w)$.) In a sense this graph can be extracted from the original 
graph $\Gamma$ in the following way: If we think of $\Gamma^H$ as embedded in the graph $\Gamma$, we 
are drawing a directed edge from x to y labelled w if and only if the reduced word 
formed by juxtaposing the labels of the successive edges along the unique directed 
path in $\Gamma$ from x to y is w and w is in $Y^H$. 

Now the graph $\Gamma^H$ has as its vertices elements of $W^*(Y^H)$. There is a unique 
directed path in $\Gamma^H$ connecting any two vertices. Also for every two vertices there 
are no directed edges or exactly two directed edges, (x, y) and (y, x), respectively 
labelled w and $w^{-1}$ for a unique $w \in Y^H$. Thus the graph $\Gamma^H$ has all of the 
properties that $\Gamma$ had except that X has been replaced by $Y^H$. Thus as H is a 
regularly acting orientation- and label-preserving group of automorphisms of 
$\Gamma^H$, $H \cong F(Y^H_+)$, where $Y^H = Y^H_+ + Y^H_-$ is any partition of $Y^H$ such that for 
each $w \in Y^H$, exactly one representative of $\{w, w^{-1}\}$ lies in $Y^H_+$. (Notice that 
we have used the inverse map on vertices of $\Gamma$ in describing $\Gamma^H$.) Thus the graph 
$\Gamma^H$ is a frame $\mathcal{F}(Y^H, -1)$, and H, being transitive, is by Lemma 6.3.1 the free 
group on $Y^H_+$. 


Remark If X is finite with at least two elements, it is possible that the subgroup H 
is a free group on infinitely many generators (see Exercise (1) in Sect. 6.5.1). 


6.3.2 The Universal Property 


We begin with a fundamental theorem. 


Theorem 6.3.3 Suppose G is any group, and suppose X is a set of generators of G. 
Then there is a group epimorphism F(X) — G, from the free group on the set X 
onto our given group G, taking X to X identically. 


Proof Here X is simply a set of generators of the group G, and so a subset of G. But 
a free group F(X) can be constructed on any set, despite whatever previous roles the 
set X had in the context of G. One constructs the formal set $Y = X + X^{-1}$, two 
disjoint copies of X, with the notational convention that the exponential operator −1 
denotes both a bijection $X \to X^{-1}$ and its inverse $X^{-1} \to X$, so that $(x^{-1})^{-1} = x$. 
Recall that any arbitrary element of the free group F(X) was a reduced word in the 
alphabet Y, that is, an element of $W^*(Y)$, consisting of those words which do not 
have the form $w_1 y y^{-1} w_2$ for any $y \in Y$. (Recall that these were the vertices of the 
Cayley graph $\Gamma = \Gamma(Y)$ which we constructed in the previous subsection.) 

Now there is a mapping f which we may think of as our “amnesia-recovering 
morphism”. For each word $w = y_1 \cdots y_r \in W^*(Y)$, f(w) is the element of G we 
obtain when we “remember” that each $y_i$ is one of the generators of G or the inverse 
of one. Clearly $f(w_1 w_2) = f(w_1) f(w_2)$, since this is how elements multiply when 
we remember that they are group elements. Then f is onto because every element 
of G can be expressed as a finite product of elements of X and their inverses. There 
is really nothing more to prove. 
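
The “amnesia-recovering morphism” can be imitated in code. The following sketch is an illustration under assumptions that are not in the text: G is taken to be Sym(3) with generators a = (12) and b = (23) stored as tuples of images, an upper-case letter stands for the formal inverse of the corresponding lower-case one, and `reduce_word` and `evaluate` are hypothetical names.

```python
def compose(p, q):
    """Return the permutation 'first p, then q' (tuples of images)."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def reduce_word(word):
    """Free reduction: repeatedly cancel adjacent factors y y^{-1}."""
    out = []
    for y in word:
        if out and out[-1] == y.swapcase():
            out.pop()
        else:
            out.append(y)
    return "".join(out)


GENS = {"a": (1, 0, 2), "b": (0, 2, 1)}  # a = (12), b = (23) in Sym(3)


def evaluate(word):
    """The map f: remember that each letter is a generator or an inverse."""
    g = (0, 1, 2)
    for y in word:
        x = GENS[y.lower()]
        g = compose(g, x if y.islower() else inverse(x))
    return g


w1, w2 = "abA", "aBab"
product_in_free_group = reduce_word(w1 + w2)   # multiplication in F(X)
```

For this pair, `evaluate(product_in_free_group)` agrees with `compose(evaluate(w1), evaluate(w2))`, and the six words "", "a", "b", "ab", "ba", "aba" already evaluate to six distinct elements, so f is onto Sym(3).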


One should bear in mind that the epimorphism of the theorem completely depends 
on the pair (G, X). If we replace X by another set of generators Y, possibly of 
smaller cardinality, one gets an entirely different epimorphism $F(Y) \to G$. 


6.3.3 Generators and Relations 


Now consider the same situation as in the previous subsection: A group G contains 
a set of generators X, and there is an epimorphism $f := f_{G,X} : F(X) \to G$, taking 
X to X “identically”. But there was on the one hand, a directed labelled graph $\Gamma$ 
(the “frame”) that came from the construction in Lemma 6.3.1. This was seen to 
be the Cayley graph C(F(X), X). Since $X \cup X^{-1}$ is also a set of generators of G, 
one may form the Cayley graph $C(G, X \cup X^{-1})$. Then by Lemma 6.2.1, there is a 
label-preserving vertex-surjective homomorphism of directed labelled graphs 


$f : \Gamma \to C(G, X \cup X^{-1})$, 


from the frame $\Gamma = \mathcal{F}(X \cup X^{-1}, -1)$ to the Cayley graph. 

Where $\Gamma$ had no circuitous directed walks other than backtracks, the Cayley graph 
$C(G, X \cup X^{-1})$ may now have directed circuits. In fact it must have them if f is not 
an isomorphism. For suppose $f(w_1) = f(w_2)$. If $w_1 \ne w_2$, there is in $\Gamma$ a unique 
directed walk from $w_1$ to $w_2$. Its image is then a directed circuit in $C(G, X \cup X^{-1})$. 

Since G acts regularly on its Cayley graph by left multiplication, every directed 
circular walk in $C(G, X \cup X^{-1})$ can be translated to one that begins and ends at the 
identity element 1 of G. Ignoring the peripheral attached backtracks, we can take 
such a walk to be a circuit 


$(1, x_1, x_1x_2, x_1x_2x_3, \ldots, x_1x_2 \cdots x_k = 1)$, $x_i \in X \cup X^{-1}$, 


where $x_1x_2 \cdots x_k$ is reduced in the sense that for i = 1, ..., k − 1, $x_i \ne x_{i+1}^{-1}$. The 
equation 
$x_1 x_2 \cdots x_k = 1$ 


where a reduced word in the generators is asserted to be 1, is called a relation. In 
a set of relations, the words that are being asserted to be the identity are called 
relators. Obviously, the distinction is only of grammatical significance since the 
relators determine the relations and vice versa. 

Let R be a set of relators in the set of generators X; precisely, this means that R 
is a set of reduced words in $W^*(X \cup X^{-1})$. Then the normal closure of R in the free 
group F(X) is the subgroup $\langle R^{F(X)} \rangle_{F(X)}$ generated by the set of all conjugates in 
F(X) of all relators in R. Clearly, this is the intersection of all normal subgroups of 
F(X) which contain R. A group G is said to be presented by the generators X and 
relations R, if and only if 


$G \cong F(X)/\langle R^{F(X)} \rangle_{F(X)}$. 


This is expressed by writing 
$G = \langle X \mid R \rangle$. 


This is the “free-est” group which is generated by a set X and satisfies the 
relations R, in the very real sense that any other group which satisfies these relations 
is a homomorphic image of the presented group. Stated more precisely we have 

Theorem 6.3.4 Let $\langle X \mid R \rangle$ be a presented group, and let $\theta : X \to H$ be a function, 
where H is a group with identity element e. Then $\theta$ extends uniquely to a 
homomorphism $\langle X \mid R \rangle \to H$ if and only if 


$\theta(x_1)\theta(x_2) \cdots \theta(x_r) = e$ 
whenever $x_1x_2 \cdots x_r \in R$. 


In writing out a presentation, the collection R of words declared to be equal 
to the identity element can be replaced by equations that are equivalent to such 
declarations. For example, if $R = \{xyx, x^2yx^2\}$ one can replace R by equations 
$\{yx^2 = 1, x^2y = x^{-2}\}$, or $\{xy = x^{-1}, x^3y = x^{-1}\}$. Let us look at some examples: 
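
The criterion of Theorem 6.3.4 is a finite check once H is given concretely. The sketch below is only an illustration: it assumes H is a permutation group with elements stored as tuples of images, relators are strings in which an upper-case letter denotes an inverse, and `respects_relators` is a hypothetical helper name.

```python
def compose(p, q):
    """Return the permutation 'first p, then q' (tuples of images)."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def evaluate(word, theta):
    """Multiply out theta(x_1) theta(x_2) ... theta(x_r) in H."""
    n = len(next(iter(theta.values())))
    g = tuple(range(n))
    for letter in word:
        h = theta[letter.lower()]
        g = compose(g, h if letter.islower() else inverse(h))
    return g


def respects_relators(theta, relators):
    """True when every relator evaluates to the identity of H."""
    n = len(next(iter(theta.values())))
    identity = tuple(range(n))
    return all(evaluate(r, theta) == identity for r in relators)


# theta(x) = (12), theta(y) = (23) in H = Sym(3), on the points 0, 1, 2.
theta = {"x": (1, 0, 2), "y": (0, 2, 1)}
ok = respects_relators(theta, ["xx", "yy", "xyxyxy"])
bad = respects_relators(theta, ["xx", "yy", "xyxy"])
```

With these choices `ok` is true, so $\theta$ extends to a homomorphism from $\langle x, y \mid x^2, y^2, (xy)^3 \rangle$ to Sym(3), while `bad` is false because xy has order 3 in Sym(3), so no extension exists for the relators $x^2, y^2, (xy)^2$.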


Example 29 (The infinite dihedral group $D_\infty$) $G = \langle x, y \mid x^2 = y^2 = 1 \rangle$. As you 
can see we allow some latitude in the notation. The set of generators here is 
X = {x, y}; but we ignore the extra wavy brackets in displaying the presentation. 
The set of relators is $R = \{x^2, y^2\}$, but we have actually written out the explicit 
relations. 

This is a group generated by two involutions x and y. Nothing is hypothesized 
at all about the order of the product xy, so it generates an infinite cyclic subgroup 
$N = \langle xy \rangle$. Now setting u := xy, the generator of N, we see that $yx = u^{-1}$ and 
that x(xy)x = yx, so $xux = u^{-1}$. Similarly $yuy = u^{-1}$. Moreover, uy = x shows 
Ny = Nx. Thus 

xN = Nx = Ny = yN 


and so N is a normal cyclic subgroup of index 2 in G. One of the standard models 
of this group is given by its action on the integers. 
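
One concrete version of the action on the integers can be sketched as follows. The specific maps are an assumption chosen for illustration (the text does not fix them): x is taken to be the reflection n ↦ −n and y the reflection n ↦ 1 − n, with products read as composition of functions.

```python
def x(n):
    """Reflection of the integers about 0 (an assumed model for x)."""
    return -n


def y(n):
    """Reflection of the integers about 1/2 (an assumed model for y)."""
    return 1 - n


def u(n):
    """The product u = xy, taken here as the composite x(y(n)) = n - 1."""
    return x(y(n))


def u_inverse(n):
    """Inverse translation, equal to y(x(n)) = n + 1."""
    return y(x(n))


def iterate(f, k, n):
    """Apply f to n a total of k times."""
    for _ in range(k):
        n = f(n)
    return n


far_point = iterate(u, 100, 0)   # u never returns 0 to itself
```

Both x and y are involutions, u is a translation of infinite order, and conjugating u by x or by y gives `u_inverse`, matching $xux = u^{-1}$ and $yuy = u^{-1}$.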


Example 30 (The dihedral group of order 2n, denoted $D_{2n}$) Here 
$G = \langle x, y \mid x^2 = y^2 = (xy)^n = 1 \rangle$. Everything is just as above except that the 
element u = xy now has order n. This means the subgroup $N = \langle u \rangle$ (which is still 
normal) is cyclic of order n, so G = N + xN has order 2n. Note that this group is a 
homomorphic image of the group of Example 29. When n = 2, the group is an 
elementary abelian group of order 4, the so-called Klein fours group. 


Example 31 Consider the following fairly simple presentation: 
$G = \langle x, y \mid xy = y^2x, \; yx = x^2y \rangle$. It follows that 


$y^{-1}xy = y^{-1}(y^2x) = yx = x^2y = x(xy)$, 

so that $y^{-1} = x$ (after right multiplying the extreme members of the above equation 
by $(xy)^{-1}$). But then 


$e = xy = y^2x = y(yx) = y$, 


so y = e, implying x = e. In other words, the relations imposed on the generating 
elements of G are so destructive that the group defined is actually the trivial group. 

The general question of calculating the order of a group presented by generators 
and relations is not only difficult, but in certain instances can be shown to be an 
impossible task. (This is a consequence of Boone’s theorem on the so-called 
word problem in group theory.) 


Example 32 (Polyhedral or (k, l, m)-groups) Here k, l, and m are integers greater 
than one. The presentation is 


$G = \langle x, y \mid x^k = y^l = (xy)^m = 1 \rangle$. 

The case that k = l = 2 is the dihedral group of order 2m of Example 30, above. We 
consider a few more cases: 


(1) THE CASE (k = 2, m = 2): Here u = xy has order 2. But then x and u are 
involutions, and the product xu = y has order l. So G is a homomorphic image 
of the dihedral group of order 2l, namely: 


$\langle x, u \mid x^2 = u^2 = (xu)^l = 1 \rangle \cong D_{2l}$. 


It is not difficult to see that this homomorphism is an isomorphism. 

(2) THE CASE (k = 2, l = 3, m = 3): Here the relations are 


$x^2 = y^3 = (xy)^3 = 1$. 


There is a group satisfying the relations. The alternating group on four letters 
contains a permutation x = (12)(34) of order 2, an element y = (123) of 
order 3 for which the product u = (12)(34)(123) = (134) has order three, and 
these elements generate Alt(4). Thus there is an epimorphism of the presented 
(2, 3, 3)-group onto Alt(4). But the reader can check that the relations imply 


$x(yxy^2) = y^2xy$. 

The left side is the product of two involutions, while the right side is an involution. 
It follows that the involutions x and $x_1 = yxy^2$ commute. But now the equations 
just presented show that 


$x^{y^2} = x_1$, and $x^y = xx_1$. 


Thus the abelian fours group $K := \langle x, x_1 \rangle$ is normalized by y, so the presented 
(2, 3, 3)-group G has coset decomposition 


$G = K + yK + y^2K$, 


and so has order 12. 
It follows that G is isomorphic to Alt(4). 
(3) THE CASE (k = 2, l = 3, m = 4): Here $x^2 = y^3 = (xy)^4 = 1$, so
$xyxy = y^{-1}xy^{-1}x$ and $yxyx = xy^{-1}xy^{-1}$. Now conjugation by x inverts
$y^{-1}xy^{-1}$, since

$$x(y^{-1}xy^{-1})x = (xy^{-1}xy^{-1})x = (yxyx)x = yxy = (y^{-1}xy^{-1})^{-1}.$$

But

$$y \cdot yxyx = y^{-1}(xyx) = y^{-1}(y^{-1}xy^{-1}xy^{-1}) = (yx)y^{-1}(yx)^{-1}$$

is a conjugate of $y^{-1}$ and so has order 3. But yx is conjugate to xy, so yx
has order four. Summarizing,

(i) y has order 3,
(ii) yxyx has order 2, and
(iii) $y \cdot yxyx = y^2xyx$ has order 3.


Thus,

$$H := \langle y, yxyx \rangle$$

is a (2, 3, 3)-group, and so by the previous case, has order at most 12. But
obviously

$$G = \langle x, y \rangle = \langle x, H \rangle = H + Hx,$$

so |G| ≤ 24. But in the case of Sym(4), setting x = (12) and y := (134), we
see that there is an epimorphism

$$G \to \mathrm{Sym}(4).$$

The orders force⁵ this to be an isomorphism, whence
any (2, 3, 4)-group is Sym(4). (6.2)
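
The choice x = (12), y = (134) can be verified directly. The Python sketch below is an illustration under the same assumptions as before (right-action products, invented helper names). It checks the three orders listed in the summary and the sizes of H and G inside Sym(4).

```python
def perm_from_cycles(n, cycles):
    """Permutation of {1..n} as a tuple p, where p[i-1] is the image of i."""
    image = list(range(1, n + 1))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            image[a - 1] = b
    return tuple(image)

def mul(p, q):
    """Product pq with p applied first (right action)."""
    return tuple(q[p[i] - 1] for i in range(len(p)))

def order(p):
    identity = tuple(range(1, len(p) + 1))
    k, power = 1, p
    while power != identity:
        power = mul(power, p)
        k += 1
    return k

def generate(gens):
    """All elements of the finite group generated by gens."""
    identity = tuple(range(1, len(gens[0]) + 1))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

x = perm_from_cycles(4, [(1, 2)])
y = perm_from_cycles(4, [(1, 3, 4)])
xy = mul(x, y)
yx = mul(y, x)
yxyx = mul(yx, yx)            # the involution of item (ii)
y_yxyx = mul(y, yxyx)         # the element of item (iii)

H = generate([y, yxyx])       # the (2, 3, 3)-subgroup
G = generate([x, y])

print(order(xy), order(yxyx), order(y_yxyx), len(H), len(G))
```

Here H has index 2 in G, in agreement with the decomposition G = H + Hx.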


(4) THE CASE (k, l, m) = (2, 3, 5): So here goes. We certainly have

$$G = \langle x, y \mid x^2 = y^3 = (xy)^5 = 1 \rangle.$$

Again set z := xy. Then y = xz has order three.
Then for x and z, we have the relations

$$x^2 = z^5 = 1, \tag{6.3}$$
$$(xz)^3 = 1. \tag{6.4}$$


5 Of course this could also be worked out by showing that the Cayley graph must complete itself in
a finite number of vertices. But the reasons seem equally ad hoc when done this way.
But the latter implies

$$xzx = z^{-1}xz^{-1}, \tag{6.5}$$
$$xz^{-1}x = zxz. \tag{6.6}$$

Now set $t := zxz^{-1}$, an involution. Then

$$(ty)^3 = (zxz^{-1} \cdot xz)^3 = (z(zxz)z)^3 = (z^2xz^2)^3 \tag{6.7}$$
$$= z^2(xz^{-1}x)z^{-1}xz^2 = z^2(zxz)z^{-1}xz^2 = z^5 = 1. \tag{6.8}$$

Thus $H = \langle t, y \rangle$ is a (2, 3, 3)-subgroup of order divisible by 6, and so is Alt(4).
It contains a normal subgroup K with involutions t, $r := t^{y}$ and $t^{y^2}$, and setting
$Y = \langle y \rangle$, one has H = Y + YrY.
Next let s be the involution z~!(¢”)z, which, upon writing ¢ and y in terms of x 
and z becomes 


Now 


sy =z 'xzg3xz- xz = (xz) 1 (z3x27)(xz) 
= (xz) zz lz)z- xz 


is a conjugate of x and so has order 2. Thus s inverts y. 
Also 


sr =st?) = (z-'xz3xz)(xz3xz) 
= (xz)! g3xzxz3 (xz) 
= (xz) 123 (27! xz!) 23 (xz) 
= (xz)! (z2x27)(xz) 


is conjugate to z7xz* which has order 3 by Eq. 6.7. Consider now the collection 
M=KHK =H+HsH =H + Hs+Hst + Hst? + Hst? , 


a set of 60 elements closed under taking inverses. We show that M is closed
under multiplication. Using the three results

$$M = H + HsH,$$
$$H = Y + YrY,$$
$$sY = Ys,$$

we have MM ⊆ M if and only if

$$srs \in M.$$

But this is true since $(sr)^3 = 1$ implies

$$srs = rsr \in HsH \subseteq M.$$


Finally, to get G = M we must show that at least two of the three generators
x, y or z lie in M. Now M already contains y, and


s=zlxzxz = (xz)! 23 (xz) = y ey, 


and so M contains $z^3$ and $z = (z^3)^2$. Now, as |G| = 60 and G contains a
subgroup of index 5, there is a homomorphism ρ : G → Sym(5) whose image
has order at least 30 = lcm{2, 3, 5}. Because of the known normal structure
of Sym(5) obtained in the previous chapter, the only possibility is that ρ is a
monomorphism and that G ≃ Alt(5).

Again, this could have been proved using the Cayley graph. Neither proof is
particularly enlightening.⁶


Example 33 (Coxeter Groups) This class covers all groups generated by involutions
whose only further relations are the declared orders of the pairwise products of these
involutions. First we fix an abstract set X. A Coxeter matrix M over X is a function
M : X × X → N ∪ {∞} which associates with each ordered pair (x, y) ∈ X × X
a natural number m(x, y) or the symbol "∞" so that (1) m(x, x) = 1 and (2)
m(x, y) = m(y, x), that is, the matrix is "symmetric".⁷

The Coxeter group W(M) is then defined to be the presented group

$$W(M) := \langle X \mid R \rangle,$$

where $R = \{(xy)^{m(x,y)} \mid (x, y) \in X \times X\}$ (no relation is imposed when
m(x, y) = ∞). Thus W(M) is generated by involutions
(because of the 1's on the 'diagonal' of M). If X is non-empty, the group is always
non-trivial. In 1934, H. S. M. Coxeter [6] classified the M for which W(M) is a finite
group. These finite Coxeter groups have an uncanny way of appearing all over a
large part of important mathematics: Lie Groups, Algebraic Groups, and Buildings,
to name a few areas.


6 The real situation is basically topological. It is a fact that a (k, l, m)-group is finite if and only if

$$1/k + 1/l + 1/m > 1.$$

The natural proof of this is in terms of manifolds and curvature. Far from discouraging anyone,
such a connection shows the great and mysterious "interconnectedness" of mathematics!
7 Of course if X is finite, it can be totally ordered so that the data provided by M can be encoded 
in an ordinary matrix. We just continue to say “matrix” in the infinite case—perhaps out of love of 
analogy. 
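
The finiteness criterion of footnote 6 is easy to tabulate. The Python sketch below is illustrative only. The function names are invented here, and the order formula |G| = 2 / (1/k + 1/l + 1/m − 1) is the standard formula for the finite cases; it is not stated in the text, but it reproduces the orders 2m, 12, 24 and 60 found in Example 32.

```python
from fractions import Fraction

def curvature_excess(k, l, m):
    """The quantity 1/k + 1/l + 1/m - 1, computed exactly."""
    return Fraction(1, k) + Fraction(1, l) + Fraction(1, m) - 1

def is_finite_klm_group(k, l, m):
    """Criterion of footnote 6: finite iff 1/k + 1/l + 1/m > 1."""
    return curvature_excess(k, l, m) > 0

def predicted_order(k, l, m):
    """Order 2 / (1/k + 1/l + 1/m - 1) in the finite case, else None.

    This formula is an assumption imported from the standard theory of
    spherical triangle groups; the text only states the criterion.
    """
    excess = curvature_excess(k, l, m)
    if excess <= 0:
        return None
    return 2 / excess

# Finite cases with 2 <= k <= l <= m <= 7.
finite_cases = [
    (k, l, m, predicted_order(k, l, m))
    for k in range(2, 8)
    for l in range(k, 8)
    for m in range(l, 8)
    if is_finite_klm_group(k, l, m)
]
for case in finite_cases:
    print(case)
```

In this range the only finite cases are the dihedral triples (2, 2, m) together with (2, 3, 3), (2, 3, 4) and (2, 3, 5), which is why Example 32 stops at Alt(5).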
Remark As remarked on p. 174, there are many instances in which the choice of
relations R may very well imply that the presented group has order only 1. For
example if G := (x|x? = x73? = 1) then G has order 1. In fact, if we were to select 
at random an arbitrary set of relations on a set X of generators, the odds are that 
the presented group has order 1. From this point of view, it seems improbable that 
a presented group is not the trivial group. But this is illusory, since any non-trivial 
group bears some set of relations, and so the presented group with these relations is 
also non-trivial in these cases. So most of the time, the non-trivial-ness of a presented 
group is proved by finding an object on which the presented group can act in a
non-trivial way.⁸ Thus a Coxeter group W(M) is known not to be trivial for the reason that
it can be faithfully represented as an orthogonal group over the real field, preserving
a bilinear form $B := B_M : V \times V \to \mathbb{R}$ defined by M. Thus Coxeter Groups are
non-trivially acting groups, in which each generating involution acts as a reflection 
on some real vector space with some sort of inner product. That is the trivial part of 
the result. The real result is that the group acts faithfully in this way. This means that 
there is no product of these reflections equal to the identity transformation unless 
it is already a consequence of the relations given in M itself. This is a remarkable 
theorem. 
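
To make the reflection action concrete, here is a small numerical Python sketch. It rests on an assumption not spelled out in the text: the form is taken to be the standard one, B(e_x, e_y) = −cos(π / m(x, y)) on a real vector space with basis {e_x}, and the generator x acts as v ↦ v − 2B(e_x, v)e_x. Only finite entries m(x, y) are handled, and all function names are invented for the illustration.

```python
import math

def bilinear_form(M):
    """Gram matrix B[i][j] = -cos(pi / m(i, j)) (assumed standard form)."""
    n = len(M)
    return [[-math.cos(math.pi / M[i][j]) for j in range(n)] for i in range(n)]

def reflection(B, i):
    """Matrix of v -> v - 2 B(e_i, v) e_i with respect to the basis e_0, ..., e_{n-1}."""
    n = len(B)
    S = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for c in range(n):
        S[i][c] -= 2.0 * B[i][c]
    return S

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def order_of(A, bound=60):
    """Smallest k <= bound with A^k = I (numerically), or None."""
    P = A
    for k in range(1, bound + 1):
        if close(P, identity(len(A))):
            return k
        P = matmul(P, A)
    return None

# Coxeter matrix with m(0,1) = m(1,2) = 3 and m(0,2) = 2.
M = [[1, 3, 2],
     [3, 1, 3],
     [2, 3, 1]]
B = bilinear_form(M)
S = [reflection(B, i) for i in range(3)]

# Orders of the pairwise products of the generating reflections.
product_orders = [[order_of(matmul(S[i], S[j])) for j in range(3)] for i in range(3)]
print(product_orders)
```

The matrix of orders printed at the end should coincide with M itself, which is the easy half of the statement in the Remark. The sketch says nothing about faithfulness, which is the deep part.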


6.4 The Brauer-Ree Theorem 


If G acts on a finite set Ω, and X is any subset of G, we let ν(X) be defined by the
equation

$$\nu(X) := \sum_{i} (|\Gamma_i| - 1),$$

where the sum ranges over the ⟨X⟩-orbits $\Gamma_i$ induced on Ω. Thus if 1 is the identity
element, ν(1) = 0, and if t is an involution with exactly k orbits of length 2, then
ν(t) = k.


Theorem 6.4.1 (Brauer-Ree) Suppose $G = \langle x_1, \ldots, x_m \rangle$ acts on Ω and $x_1 \cdots x_m = 1$.
Then

$$\sum_{i=1}^{m} \nu(x_i) \ge 2\nu(G).$$
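
A small computation may help fix the definitions. The Python sketch below is an illustration with invented names. It computes ν from orbit sizes and checks the Brauer-Ree inequality for one hand-picked example in Sym(3) acting on {0, 1, 2}, with products taken left to right.

```python
def orbit_sizes(n, perms):
    """Sizes of the orbits of the group generated by perms on {0, ..., n-1}.

    Each permutation is a tuple p with p[i] the image of i. The orbits are the
    connected components of the graph with an edge from i to p[i].
    """
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for p in perms:
        for i in range(n):
            ra, rb = find(i), find(p[i])
            if ra != rb:
                parent[ra] = rb
    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values())

def nu(n, perms):
    """nu(X) = sum over <X>-orbits of (orbit size - 1)."""
    return sum(size - 1 for size in orbit_sizes(n, perms))

def mul(p, q):
    """Product pq with p applied first."""
    return tuple(q[p[i]] for i in range(len(p)))

# Example: x1 = (0 1), x2 = (1 2), x3 = (x1 x2)^{-1}, so that x1 x2 x3 = 1.
x1 = (1, 0, 2)
x2 = (0, 2, 1)
x3 = (1, 2, 0)
gens = [x1, x2, x3]

lhs = sum(nu(3, [x]) for x in gens)   # sum of the nu(x_i)
rhs = 2 * nu(3, gens)                 # 2 * nu(G)
print(lhs, rhs)
```

In this example the two sides agree, so the bound of the theorem cannot be improved in general.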


Proof Let $\Omega = \Omega_1 + \cdots + \Omega_k$ be a decomposition of Ω into G-orbits, and for any
subset X of G, set

$$\nu_i(X) := \sum_{\langle X \rangle\text{-orbits } \Gamma_j \text{ in } \Omega_i} (|\Gamma_j| - 1),$$


8 It is the perfect analogue of Locke's notion of "substance": something on which to hang properties.
so $\nu(X) = \sum_i \nu_i(X)$. By induction, if k > 1, $\sum_j \nu_i(x_j) \ge 2(|\Omega_i| - 1)$, so

$$\sum_j \nu(x_j) = \sum_{i,j} \nu_i(x_j) \ge \sum_i 2(|\Omega_i| - 1) = 2\nu(G).$$

Thus we may assume k = 1, so G is transitive.

Now each $x_i$ induces a collection of disjoint cycles on Ω. We propose to replace
each such cycle of length d by the product of d − 1 transpositions $t_1 \cdots t_{d-1}$. Then
the contribution of that particular cycle of $x_i$ to $\nu(x_i)$ is $\sum_j \nu(t_j) = d - 1$, and $x_i$
itself is the product of the transpositions which were used to make up each of its
constituent cycles. We denote this collection of transpositions $T(x_i)$, so

$$\nu(x_i) = \sum_{t \in T(x_i)} \nu(t) \quad \text{and} \quad x_i = \prod_{t \in T(x_i)} t,$$

the product being taken in an appropriate order. Since $x_1 \cdots x_m = 1$, we have

$$\prod_{i=1}^{m} \prod_{t \in T(x_i)} t = 1,$$

or, letting T be the union of the $T(x_i)$, we may say $\prod_{t \in T} t = 1$, with the transpositions
in an appropriate order. But as i ranges over 1, ..., m, the cycles cover all points
of Ω, since G is transitive. It follows that the graph $\mathcal{G}(T) := (\Omega, T)$, where the
transpositions are regarded as edges, is connected. It follows that

$$\langle T \rangle = \mathrm{Sym}(\Omega).$$

But Sym(Ω) is transitive and so ν(Sym(Ω)) = ν(G). Since $\prod_{t \in T} t = 1$, $\langle T \rangle = \mathrm{Sym}(\Omega)$,
and $\sum_{t \in T} \nu(t) = \sum_i \nu(x_i)$, it suffices to prove the asserted inequality for the
case that G = Sym(Ω) and the $x_i$ are transpositions.

So now, G = Sym(Ω), $x_1 \cdots x_m = 1$, where the $x_i$ are a generating set of
transpositions. Among the $x_i$ we can select a minimal generating set $\{s_1, \ldots, s_k\}$
taken in the order in which they are encountered in the product $x_1 \cdots x_m$. We can
move these $s_i$ to the left in the product $\prod x_i$ at the expense of conjugating the elements
they pass over, to obtain

$$s_1 s_2 \cdots s_k \, y_1 y_2 \cdots y_l = 1, \quad k + l = m,$$

where the $y_j$'s are conjugates of the $x_i$'s. Now by the Feit-Lyndon-Scott Theorem
(Theorem 4.2.5, p. 112), the product $s_1 \cdots s_k$ is an n-cycle (where n = |Ω|). Thus
k ≥ n − 1. But then $y_1 \cdots y_l$ is the inverse n-cycle, and so l ≥ n − 1, also. But
$k + l = m = \sum \nu(x_i)$ and n − 1 = ν(G), so the theorem is proved.
6.5 Exercises


6.5.1 Exercises for Sect. 6.3 


1. Show that if |X| > 1, then it is possible to have |X| finite, while |Y” | is infinite. 
That is, a free group on a finite number of generators can have a subgroup which 
is a free group on an infinite number of generators. [Hint: Show that one may 
select members of Y” first, and choose them with unbounded length. 

2. This is a project. Is it possible for a group to be a free group on more than one 
cardinality of generator? The question comes down to asking whether F(X) ~ 
F(Z) implies there is a bijection X — Z. Clearly the graph-automorphism 
definition of a free group is the way to go. 

3. Another project: What can one say about the automorphism group of the free 
group F(X)? Use the remarks above to show that it is larger than Sym(X). [Hint: 
Consider the free group on one generator]. 

4. Suppose $G = \langle x, y \mid R \rangle$. Show that G = {e}, the identity group, if R implies
either of the following relations:


(a) yx =y?x,xy? = yx, 
Oar =s 42 7S 


5. Prove that 


1 


(x, y|x* =e, y = x’, yxy = yx?) Y~(r,8, t|r? =f=fP= rst). 


6. Prove that the group

$$G = \langle x, y \mid x^2 = y^3 = (xy)^3 = e \rangle$$

has order at most 12. Conclude that G ≃ Alt(4).

7. Prove that

(a) $|\langle x, y \mid x^2 = y^3 = (xy)^4 = e \rangle| = 24$

(b) $|\langle x, y \mid x^2 = y^3 = (xy)^5 = e \rangle| = 60$


8. Let $Q_{2^{a+1}} := \langle x, y \mid x^{2^a} = e,\ y^2 = x^{2^{a-1}},\ yxy^{-1} = x^{-1} \rangle$. Show that
$|Q_{2^{a+1}}| = 2^{a+1}$. When a > 1, the group $Q_{2^{a+1}}$ is called the generalized quaternion group.
It contains a unique involution. Can you prove this?

9. Suppose G is a (k, l, m)-group. Using only the directed Cayley graph C(G, X),
where X is the set of two elements {x, y} of orders 2 and 3, show that

(a) any (2, 3, 4)-group has order 24 and so is Sym(4).
(b) any (2, 3, 5)-group has order 60.


10. The following chain of exercises was inspired by a problem in a recent book 
on Group Theory. That book asks the student to show that any simple group of 
order 360 is the alternating group on six letters. As a "Hint", it asks the reader
to consider the index of a 5-Sylow normalizer. This would be a useful hint if
there were just six 5-Sylow subgroups. Unfortunately, there are 36 of them, and
the proof does not appear to be a simple consequence of group actions and the
Sylow theorems. However the statement can be proved as a useful application
of (k, l, m)-groups.


In the following G is a simple group of order 360. The reader is asked to prove 
that G is isomorphic to the alternating group on six letters. It will suffice to prove 
that it has a subgroup of index 6. In this series of exercises, the student is asked 
to assume 


(H) There is no subgroup of index 6 in G. 
and show that this leads to a contradiction. 


(1) Show that G contains 36 5-Sylow subgroups, and so contains 144 elements 
of order 5. [Hint: Use Sylow’s theorem, the simplicity of G, and (H).] 

(2) The 3-Sylow subgroups are abelian, and each intersects its other conjugates
only at the identity element. [Hint: If x were an element of order three
contained in two distinct 3-Sylow subgroups, then the number d of 3-Sylow
subgroups contained in $C_G(x)$ is 4 or 40 and the latter choice makes $|C_G(x)|$
large enough to conflict with (H). So $T = \mathrm{Syl}_3(C_G(x))$ has 4 elements acted
on transitively by conjugation, with one of the 3-Sylows inducing a 3-cycle.
If F is the kernel of the action of $C_G(x)$ on T, then $C_G(x)/F$ is isomorphic
to Alt(4) or Sym(4). The latter choice forces $C_G(x)$ to have index 5. Show
that $C_G(x)$ now contains a normal fours group K. Since K is normalized by
a 2-Sylow subgroup containing it, $N_G(K)$ now has index at most 5 against
(H) and simplicity.]

(3) The collection $\Omega = \mathrm{Syl}_3(G)$ has 10 members, G acts doubly transitively
on Ω, and G contains 80 elements of order 3 or 9. [Hint: If |Ω| = 40, there
are 320 non-identity elements of 3-power order and this conflicts with (1).]

(4) Prove the following:

i. Every involution t fixes exactly two letters of Ω.

ii. If t normalizes a 3-Sylow subgroup P, then t centralizes only the identity
element of P.

iii. The 2-Sylow subgroup of $N_G(P)$ is cyclic of order 4.

iv. P is elementary abelian.
[Hint: Since t must act on the elements of Ω − {P} exactly as it does
on the elements of P, t must fix 2 or 4 letters of Ω. But it must also fix
an odd number of letters (why?). Now that $t^{-1}at = a^{-1}$ for all a ∈ P,
t is the unique involution in a 2-Sylow subgroup of $N_G(P)$ (explain).
The other parts follow.]

(5) There are no elements of orders 6, 10 or 15. 

(6) Fix an involution t and let $\mathcal{A}$ be the full collection of 40 subgroups of order 3
in G. (Of course these are partitioned in 10 sets of four each, by the 3-Sylow
subgroups.) By Step (5), we know that if y is an element of order 3, then the
product ty has order 2, 3, 4 or 5. From our results on (k, l, m)-groups, we
have a corresponding partition of $\mathcal{A}$ with these components:

$$\mathcal{A}_2 := \{A \in \mathcal{A} \mid \langle t, A \rangle \simeq \mathrm{Sym}(3)\} \tag{6.9}$$
$$\mathcal{A}_3 := \{A \in \mathcal{A} \mid \langle t, A \rangle \simeq \mathrm{Alt}(4)\} \tag{6.10}$$
$$\mathcal{A}_4 := \{A \in \mathcal{A} \mid \langle t, A \rangle \simeq \mathrm{Sym}(4)\} \tag{6.11}$$
$$\mathcal{A}_5 := \{A \in \mathcal{A} \mid \langle t, A \rangle \simeq \mathrm{Alt}(5)\} \tag{6.12}$$

Prove the following:

(i) $|\mathcal{A}_2| = 8$

(ii) $|\mathcal{A}_3| = 8$

(iii) $|\mathcal{A}_4| = 8$


[Hint: For (i), count $Z_3$'s inverted by t. For (ii), let $\mathcal{K}(t)$ be the set of
Klein fours groups K which contain t. (There are just two.) Each such
member is normalized by four members of $\mathcal{A}$, and, of course, these four
belong to $\mathcal{A}_3$. Similarly for (iii), let $\mathcal{K}'(t)$ be the fours groups which are
normalized by t but do not contain t. There is a bijection between $\mathcal{K}'(t)$
and the set of involutions of $C_G(t) - \{t\}$. If $K \in \mathcal{K}'(t)$, then t normalizes
two of the elements of $\mathcal{A}$ in $N_G(K) \simeq \mathrm{Sym}(4)$, but generates the whole
$N_G(K)$ with the other two.]


(7) Conclude that $\mathcal{A}_5$ is not empty.


6.5.2 Exercises for Sect. 6.4 


1. Use the Brauer-Ree Theorem to show that Alt(7) is not a (2, 3, 7)-group. [Hint:
Use the action of G = Alt(7) on the 35 3-subsets of the seven letters. (We will call
these 3-subsets "triplets".) With respect to this action show that an involution fixes
exactly seven triplets, and so has ν-value 14, that any element of order 3 contributes
at most ν = 22, and that an element of order 7 fixes no triplet, acts in five 7-cycles and
contributes ν = 30.]
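
The ν-values quoted in the hint can be confirmed by brute force. The Python sketch below is an illustration with invented helper names. It lets a permutation of the seven letters act on the 35 triplets and reads off ν from the cycle lengths.

```python
from itertools import combinations

POINTS = range(1, 8)
TRIPLETS = list(combinations(POINTS, 3))   # the 35 three-element subsets

def perm_from_cycles(cycles):
    """Permutation of {1, ..., 7} as a dict point -> image."""
    image = {p: p for p in POINTS}
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            image[a] = b
    return image

def cycle_lengths_on_triplets(perm):
    """Cycle lengths of the permutation induced by perm on the triplets."""
    seen, lengths = set(), []
    for start in TRIPLETS:
        if start in seen:
            continue
        length, current = 0, start
        while current not in seen:
            seen.add(current)
            current = tuple(sorted(perm[p] for p in current))
            length += 1
        lengths.append(length)
    return lengths

def nu(perm):
    """nu of a single element: sum of (cycle length - 1) over its cycles."""
    return sum(length - 1 for length in cycle_lengths_on_triplets(perm))

involution = perm_from_cycles([(1, 2), (3, 4)])
three_cycle = perm_from_cycles([(1, 2, 3)])
double_three_cycle = perm_from_cycles([(1, 2, 3), (4, 5, 6)])
seven_cycle = perm_from_cycles([(1, 2, 3, 4, 5, 6, 7)])

values = {
    "involution": nu(involution),
    "3-cycle": nu(three_cycle),
    "double 3-cycle": nu(double_three_cycle),
    "7-cycle": nu(seven_cycle),
}
best_possible_sum = (values["involution"]
                     + max(values["3-cycle"], values["double 3-cycle"])
                     + values["7-cycle"])
required = 2 * (len(TRIPLETS) - 1)   # 2 * nu(G) for a transitive action
print(values, best_possible_sum, required)
```

Alt(7) has a single class of involutions, two classes of elements of order 3 (represented above) and only 7-cycles of order 7, so the four representatives cover every possibility. Comparing the last two printed numbers gives the contradiction with Theorem 6.4.1.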


6.6 A Word to the Student 


6.6.1 The Classification of Finite Simple Groups 


The Jordan-Hölder theorem associates with each finite group a multiset of finite
simple groups, the so-called composition factors. What are these groups? Many
theorems are known which, under the induction hypothesis, reduce to the case that a
minimal counter-example to the theorem, say G, is a finite simple group. If one had
a list of these groups, then one need only run through the list, verifying that each
of these groups either satisfies the conjectured conclusion or belongs to some other
class of groups forbidden by the hypothesis. This is how the classification of maximal
subgroups of the finite classical groups and the classification of flag-transitive linear
spaces proceeded [13].

After decades of intensive effort, the classification of finite simple groups was
finally realized in 2004 when the last step, classifying the so-called "quasi-thin
groups", was published (M. Aschbacher and S. Smith) [2-4].

Here is the over-all conclusion: 


Theorem 6.6.1 Suppose G is a finite simple group. Then G belongs to one of the 
following categories: 


1. G is cyclic of prime order.

2. G is an alternating group on at least five letters.

3. G belongs to one of 16 families of groups defined as automorphism groups of
objects defined by buildings.

4. G is one of a strange list of 26 so-called "sporadic groups".


The sporadic groups have no apparent algebraic reason for existing, yet they do 
exist. If anything should spur the sense of mystery about our logical universe it is 
this! (The student may consult a complete listing of these groups in the books of 
Aschbacher [1], Gorenstein [9] or Griess [10].)

There have been many estimates of the number of pages that a convincing proof 
of this grand theorem would require. For some authors it is 10,000 for others 30,000. 

Like similar endeavors in other branches of mathematics, a great deal of the 
difficulty is encountered when the degrees of freedom are small. For finite groups, 
this occurs for groups with “small” 2-rank; that is, groups of odd order and groups 
whose 2-Sylow subgroup has a cyclic subgroup of index 2. Here, the theory of group 
characters, both ordinary, exceptional and modular, plays a central role. 

Group characters are certain complex-valued functions on a finite group which 
are indicators of ways to represent a finite group as a multiplicative group of matrices 
(a “representation” of the group). This beautiful subject involves a vast interplay of 
ring theory, linear algebra and algebraic number theory. No book on general algebra 
could begin to do justice to this topic. Accordingly, the interested student is advised 
to seek out books that are entirely devoted to representation theory and/or character 
theory. The classic text for representation theory is Representation Theory of Finite 
Groups and Associative Algebras by Curtis and I. Reiner [7]. Excellent books on the 
theory of group characters are by M. Isaacs [12], W. Feit [8], M. J. Collins [5] and 
L. C. Grove [11], but there are many others. 
Chapter 7
Elementary Properties of Rings


Abstract Among the most basic concepts concerning rings are the poset of ideals
(left, right and 2-sided), possible ring homomorphisms, and the group of units of the
ring. Many examples of rings are presented—for example the monoid rings (which 
include group rings and polynomial rings of various kinds), matrix rings, quaternions, 
algebraic integers etc. This menagerie of rings provides a playground in which the 
student can explore the basic concepts (ideals, units, etc.) in vivo. 


7.1 Elementary Facts About Rings 


7.1.1 Introduction 


In the next several parts of this course we shall take up the following topics and their 
applications: 


1. The Galois Theory. 

2. The Arithmetic of Integral Domains. 
3. Semisimple Rings. 

4. Multilinear Algebra. 


All of these topics require some basic facility in the language of rings and modules. 

For the purposes of this survey course, all rings possess a multiplicative identity. 
This is not true of all treatments of the theory of rings. But virtually every major 
application of rings is one which uses the presence of the identity element (for 
example, this is always part of the axiomatics of Integral Domains, and Field Theory). 
So in order to even have a language to discuss these topics, we must touch base on 
a few elementary definitions and constructions. 


7.1.2 Definitions 


Recall from Sect. 1.2 that a monoid is a set M with an associative binary operation
which admits a two-sided identity element. Thus if (M, ·) is a monoid, there exists
an element "1" with the property that 1 · m = m · 1 = m for all m ∈ M. Notice that
Proposition 1.2.1 tells us that such a two-sided identity is unique in a monoid. Indeed
the "1" is also the unique left identity element as well as the unique right identity
element.

A ring is a set R endowed with two distinct binary operations, called (ring) 
addition and (ring) multiplication—which we denote by “+” and by juxtaposition 
or “.”, respectively—such that 


Addition laws: (R, +) is an abelian group.
Multiplicative Rules: (R, ·) is a monoid.
Distributive Laws: For all a, b, c ∈ R,

$$a(b + c) = ab + ac \tag{7.1}$$
$$(a + b)c = ac + bc \tag{7.2}$$


Just to make sure we know what is being asserted here, for any elements a, b and
c of R:

1. a + (b + c) = (a + b) + c.

2. a + b = b + a.

3. There exists a "zero element" (which we denote by $0_R$, or when the context is
clear, simply by 0) with the property that $0_R + a = a$ for all ring elements a.

4. Given a ∈ R, there exists a unique element (−a) in R such that $a + (-a) = 0_R$.

5. a(bc) = (ab)c.

6. There exists a multiplicative "identity element" (which we denote by $1_R$ or just
by 1 when no confusion is likely to arise) with the property that $1_R a = a 1_R = a$
for all ring elements a ∈ R.

7. Then there are the distributive laws which we have just stated, that is, if a, b, c ∈
R, then a(b + c) = ab + ac and (a + b)c = ac + bc.


As we know from the group theory, the zero element $0_R$ is unique in R. Also,
given an element x in R, there is exactly one additive inverse −x, and −(−x) = x.
The reader should be warned that in the axioms for a ring, the zero element may not
be distinct from the multiplicative identity 1 (although this is required for integral
domains as seen below). If they are the same, it is easy to see that the ring contains
just one element, the zero element.


Lemma 7.1.1 For any elements a, b in R: 


1. (−a)b = −(ab) = a(−b), and
2. 0 · b = b · 0 = 0.


Some notation: Suppose X and Y are subsets of the set R of elements of a ring.
We write

$$X + Y := \{x + y \mid x \in X,\ y \in Y\}, \tag{7.3}$$
$$XY := \{xy \mid x \in X,\ y \in Y\}, \tag{7.4}$$
$$-X := \{-x \mid x \in X\}. \tag{7.5}$$


Species of Rings 


• A ring is said to be commutative if and only if multiplication of ring elements is
commutative. A consequence of the distributive laws in a commutative ring is the
famous "binomial theorem" familiar to every student of algebra:


Theorem 7.1.2 (The Binomial Theorem) If a and b are elements of a commutative
ring R, then for every positive integer n,

$$(a + b)^n = \sum_{k=0}^{n} \frac{n!}{(n-k)!\,k!}\, a^k b^{n-k}.$$

• A division ring is a ring in which $(R^* := R - \{0\}, \cdot)$ is a group.

• A commutative division ring is called a field.

• A commutative ring in which $(R^*, \cdot)$ is a monoid (i.e. the non-zero elements
are closed under multiplication and possess an identity) is called an integral
domain.


There are many other species of rings—Artinian, Noetherian, Semiprime, Primi- 
tive, etc. which we shall meet later, but there is no immediate need for them at this 
point. 

There are many examples of rings that should be familiar to the reader from 
previous courses. The most common examples are the following: 


Z, the integers, an integral domain, 

Q, the field of rational numbers, 

R, the field of real numbers, 

C, the field of complex numbers, 

Z/(n), the ring of integral residue classes mod n, 

K [x], the ring of polynomials with coefficients from the field K. 

M,,(R), the ring of n-by-n matrices with entries in a commutative ring R, under 
ordinary matrix addition and matrix multiplication. 


More examples are developed in Sects. 7.3 and 7.4. In particular the reader will 
revisit polynomial rings as part of a general construction. 

Finally, if R is a ring with multiplicative identity element 1, then a subring of R
is a subset S of R that is itself a ring under the addition and multiplication
operations of R. Thus

1. S is a subgroup of (R, +), i.e. S = −S and S + S = S. In particular $0_R \in S$.

2. SS ⊆ S, i.e. S is closed under multiplication.

3. S contains a multiplicative identity element $e_S$ (which may or may not be the
identity element of R).


Thus the ring of integers, Z, is a subring of any of the three fields Q, R and C, of 
rational, real and complex numbers, respectively. But the set of even numbers is not 
a subring of the ring of integers. 
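
The clause "which may or may not be the identity element of R" is easy to illustrate in Z/(n). The Python sketch below is an illustration with an invented function name. It runs through the additive subgroups of Z/(n), each of which consists of the multiples of a divisor d of n and is automatically closed under multiplication, and records which of them contain their own multiplicative identity.

```python
def subrings_with_identity(n):
    """Map each divisor d of n to the identity element of S = {0, d, 2d, ...}
    inside Z/(n), for those d where S has a multiplicative identity."""
    found = {}
    for d in range(1, n + 1):
        if n % d != 0:
            continue
        S = list(range(0, n, d))          # the additive subgroup generated by d
        identities = [e for e in S if all((e * s) % n == s for s in S)]
        if identities:
            found[d] = identities[0]
    return found

print(subrings_with_identity(6))
print(subrings_with_identity(4))
```

For n = 6 the subsets {0, 3} and {0, 2, 4} are subrings in the sense above whose identity elements differ from 1. For n = 4 the subset {0, 2} has no identity element at all, much as the even integers fail to be a subring of Z.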

If R is an arbitrary ring and B is any subset of R, the centralizer of B in R will
denote the set

$$C_R(B) := \{r \in R \mid rb = br \text{ for all } b \in B\}.$$

We leave the reader to verify that the centralizer $C_R(B)$ is a subring of R (Exercise
(6) in Sect. 7.5.1). In the case that B = R, the centralizer of B is called the center
of R.¹


7.1.3 Units in Rings 


Suppose R is a ring with multiplicative identity element 1. We say that an element
x in R has a right inverse if and only if there exists an element x′ in R such that
$xx' = 1_R$. Similarly, x has a left inverse if and only if there exists an element x″ in
R such that $x''x = 1_R$. (If R is not an integral domain these right and left inverses
might not be unique.) An element x in R is called a unit if and only if it possesses a
two-sided inverse, that is, an element that is both a right and a left inverse in R. The
element $1_R$, of course, is a unit, so we always have at least one of these. We observe


Lemma 7.1.3 1. An element x has a right inverse if and only if xR = R. Similarly,
x has a left inverse if and only if Rx = R.
2. If both a right inverse and a left inverse of an element x exist, then these inverses
are equal, and represent the unique (two-sided) inverse of x.
3. The set of units of any ring forms a group under multiplication.


Proof Part 1. If x has a right inverse x′, then xR contains $xx' = 1_R$, so xR contains R.
The converse, that xR = R implies that $xx' = 1_R$ for some element x′ in R, is
transparent. The left inverse story follows upon looking at this argument through a
mirror.
Part 2. Suppose $x_l$ and $x_r$ are respectively left and right inverses of an element x.
Then

$$x_r = (1_R)x_r = (x_l x)x_r = x_l(x x_r) = x_l(1_R) = x_l.$$

Thus $x^{-1} := x_l$ is a two sided inverse of x and is unique, since it is equal to any right
inverse of x.


1 Sometimes the literature calls this the centrum of R.
Part 3. Suppose x and y are units of the ring R with inverses equal to $x^{-1}$ and
$y^{-1}$, respectively. Then xy has $y^{-1}x^{-1}$ as both a left and a right inverse. Thus xy is a
unit. Thus the set U(R) of all units of R is closed under an associative multiplication,
has an identity element with respect to which every one of its elements has a 2-sided
inverse. Thus it is a group.


A ring in which every non-zero element is a unit must be a division ring and 
conversely. 


Example 34 Some examples of groups of units: 


1. In the ring of integers, the group of units is {±1}, forming a cyclic group of order
2.

2. The integral domain Z ⊕ Zi of all complex numbers of the form a + bi with a, b
integers and $i = \sqrt{-1}$, is called the ring of Gaussian integers. The units of this
ring are {±1, ±i}, forming the cyclic group of order four under multiplication.

3. The complex numbers of the form $a + b(e^{2\pi i/3})$, a and b integers, form an integral
domain called the Eisenstein numbers. The units of this ring are $\{1, z, z^2, z^3 =
-1, z^4, z^5\}$, where $z = -e^{2\pi i/3}$, forming a cyclic group of order 6.

4. The number $\lambda := 2 + \sqrt{5}$ is a unit in the domain $D := \{a + b\sqrt{5} \mid a, b \in \mathbb{Z}\}$ since
$(2 + \sqrt{5})(-2 + \sqrt{5}) = 1$. Now λ is a real number, and since 1 < λ, we see that
the successive powers of λ form a monotone increasing sequence, and so we may
conclude that the group U(D) of units for this domain is an infinite group.

5. In the ring R := Z/Zn of residue classes of integers mod n, under addition and
multiplication of these residue classes, the units are the residue classes of the
form w + Zn where w is relatively prime to n (of course, if we wish, we may take
0 ≤ w ≤ n − 1). They thus form a group of order φ(n), where φ denotes the Euler
phi-function. Thus the units of R = Z/8Z are {1 + 8Z, 3 + 8Z, 5 + 8Z, 7 + 8Z},
and form a group isomorphic to the Klein fours group rather than the cyclic group
of order four.²

6. In the ring of n-by-n matrices with entries from a field F, the units are the invertible
matrices and they form the general linear group, GL(n, F), under multiplication.
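
Items 4 and 5 of this example lend themselves to a quick computation. The Python sketch below is an illustration with invented names. It lists the units of Z/Zn using the gcd criterion, checks that every unit of Z/8Z squares to 1, and multiplies elements a + b√5 of D represented as integer pairs (a, b).

```python
from math import gcd

def units(n):
    """Residues w with 0 <= w <= n - 1 that are relatively prime to n."""
    return [w for w in range(n) if gcd(w, n) == 1]

def multiplicative_order(w, n):
    """Order of the unit w in the group of units of Z/Zn (assumes gcd(w, n) = 1)."""
    k, power = 1, w % n
    while power != 1 % n:
        power = (power * w) % n
        k += 1
    return k

def mul_sqrt5(u, v):
    """Product of a + b*sqrt(5) and c + d*sqrt(5), as integer pairs."""
    a, b = u
    c, d = v
    return (a * c + 5 * b * d, a * d + b * c)

units_mod_8 = units(8)
orders_mod_8 = [multiplicative_order(w, 8) for w in units_mod_8]
orders_mod_5 = [multiplicative_order(w, 5) for w in units(5)]

lam = (2, 1)                         # lambda = 2 + sqrt(5)
lam_inverse = (-2, 1)                # -2 + sqrt(5)
powers = [lam]
for _ in range(4):
    powers.append(mul_sqrt5(powers[-1], lam))

print(units_mod_8, orders_mod_8, orders_mod_5)
print(mul_sqrt5(lam, lam_inverse), powers)
```

No unit of Z/8Z has order 4, which is why the group is the Klein fours group, whereas Z/5Z has units of order 4 and so a cyclic group of units. The powers of λ computed at the end are pairwise distinct, in line with U(D) being infinite.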


7.2 Homomorphisms 


Let R and S be rings. A mapping f : R → S is called a ring homomorphism if and
only if


1. For all elements a and b of R, f(a + b) = f(a) + f(b).
2. For all elements a and b of R, f(ab) = f(a)f(b).


2 See p. 75 for the general definition of "fours group".
A basic consequence of Part 1 of this definition is that f induces a homomorphism
of the underlying additive group, so, from our group theory, we know that f(−a) =
−f(a), and $f(0_R) = 0_S$. In fact any polynomial expression formed from a finite
set of elements using only the operations of addition, multiplication and additive
inverses gets mapped by f to a similar expression with all the original elements
replaced by their f-images. Thus:

$$f((a + b)c^2 + (-c)bca) = (f(a) + f(b))(f(c))^2 + (-f(c))f(b)f(c)f(a).$$


As was the case with groups, the composition g ∘ f : R → T of two ring
homomorphisms f : R → S and g : S → T must also be a ring homomorphism.
The homomorphism is onto (that is, an epimorphism or a surjection), is injective, or
is an isomorphism, if and only if these descriptions apply to the induced group
homomorphism of the underlying additive groups.

An automorphism of a ring R is an isomorphism α : R → R. The composition
β ∘ α of automorphisms is itself an automorphism, and the collection of all
automorphisms of R forms a group Aut(R) under the operation of composition of
automorphisms.

An antiautomorphism of a ring R is an automorphism α of the additive group
(R, +) such that for any elements a and b of R,

$$\alpha(ab) = \alpha(b)\alpha(a).$$


Note that since the multiplicative identity element of a ring R is unique (see the
first paragraph of Sect. 7.1.2, p. 186), both automorphisms and antiautomorphisms
must leave the identity element fixed. If α is an automorphism, the set $C_R(\alpha)$ of
elements left fixed by α forms a subring of R containing its multiplicative identity
element. However, if α is an antiautomorphism, the fixed point set, while closed under
addition, need not be closed under multiplication. But, if by chance a subring S of R is
fixed elementwise by the antiautomorphism α, then S must be a commutative ring. In
fact, for any commutative ring R, the identity mapping on R is an antiautomorphism
(as well as an automorphism).

Antiautomorphisms and automorphisms of a ring R can be composed with these 
results: 


antiautomorphism ∘ antiautomorphism = automorphism
antiautomorphism ∘ automorphism = antiautomorphism
automorphism ∘ antiautomorphism = antiautomorphism


Example 35 The complex conjugation mapping that sends α = a + bi to $\bar{\alpha}$ =
a − bi, where a and b are real numbers, is certainly a familiar example of a field
automorphism. Now let F be any field, let n be a positive integer, and let σ be any
automorphism of the field F, and let $M_n(F)$ be the ring of all n × n matrices having
entries in the field F. The operations are entry-wise addition of matrices, and ordinary
matrix multiplication. If $X \in M_n(F)$, the expression $X = (x_{ij})$ means that $x_{ij}$ is
the entry in the ith row and jth column of X.


(a) The mapping m(o): S : M,(F) — M,(F) which replaces each matrix entry 
by its o-image, is transparently an automorphism of the ring M,(F). 

(b) The transpose mapping $T : M_n(F) \to M_n(F)$ replaces each entry $a_{ij}$ by $a_{ji}$
for any matrix $A = (a_{ij}) \in M_n(F)$. Suppose $A = (a_{ij})$ and $B = (b_{ij})$ are
elements of $M_n(F)$. Then the $(i, j)$-entry of the matrix AB is the “dot” product
of the ith row of A with the jth column of B, that is, $\sum_k a_{ik} b_{kj}$. On the other
hand, the $(j, i)$th entry of $B^T A^T$ is the dot product of the jth row of $B^T$ (which
is the jth column of B) and the ith column of $A^T$ (which is the ith row of A), and so
equals the $(i, j)$th entry of AB. Thus this product is the transpose of the matrix AB, and
we may write

\[ B^T A^T = (AB)^T. \]

Since T respects addition, we see that the transpose mapping is an antiautomorphism
of the ring $M_n(F)$.
(c) One can then compose the two mappings to obtain the antiautomorphism

\[ S \circ T = T \circ S : M_n(F) \to M_n(F), \]

which we call the $\sigma$-transpose mapping.
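The identities of Example 35 can be spot-checked numerically. The following Python sketch takes F to be the complex numbers, n = 2, and $\sigma$ to be complex conjugation; the helper names (`matmul`, `transpose`, `conj_entries`, `sigma_transpose`) and the two sample matrices are illustrative assumptions, not notation from the text.

```python
# Sketch only: 2x2 complex matrices as nested lists. The entries are small
# Gaussian integers, so the floating-point arithmetic below is exact.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def conj_entries(A):
    # the map S: apply sigma (here complex conjugation) to every entry
    return [[complex(z).conjugate() for z in row] for row in A]

def sigma_transpose(A):
    # the composite S o T
    return conj_entries(transpose(A))

A = [[1 + 2j, 3], [0, 1 - 1j]]
B = [[2, 1j], [1 + 1j, 4]]

# S preserves the order of products; T and S o T reverse it.
S_is_multiplicative = conj_entries(matmul(A, B)) == matmul(conj_entries(A), conj_entries(B))
T_reverses_products = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
T_keeps_order = transpose(matmul(A, B)) == matmul(transpose(A), transpose(B))
S_and_T_commute = conj_entries(transpose(A)) == transpose(conj_entries(A))
ST_reverses_products = sigma_transpose(matmul(A, B)) == matmul(sigma_transpose(B), sigma_transpose(A))
```

For these particular matrices AB and BA differ, so `T_keeps_order` comes out false while the other four flags are true. A check on one pair of matrices is of course only an illustration, not a proof.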


7.2.1 Ideals and Factor Rings 


Again let R be a ring. A left ideal is a subgroup $(J, +)$ of the additive group $(R, +)$
such that $RJ \subseteq J$. Similarly, a subgroup $J \le (R, +)$ is a right ideal if $JR \subseteq J$.
Finally, a (two-sided) ideal is a left ideal which is also a right ideal. The parentheses
are to indicate that if we use the word “ideal” without adornment, a two-sided ideal
is intended. A subset X of R is a two-sided ideal if and only if

\[ X + X \subseteq X, \tag{7.6} \]
\[ -X = X, \tag{7.7} \]
\[ RX + XR \subseteq X. \tag{7.8} \]


Note that while a 2-sided ideal $I \subseteq R$ is closed under both addition and multiplication,
it cannot be regarded as a subring of R unless it possesses its own multiplicative
identity element.

If I is a two-sided ideal in R, then, under addition, it is a (normal) subgroup
of the commutative additive group $(R, +)$. Thus we may form the factor group
$R/I := (R, +)/(I, +)$ whose elements are the additive cosets $x + I$ in R.
Not only is $R/I$ an additive group, but there is a natural multiplication on this set,
induced from the fact that all products formed from two such additive cosets, $x + I$
and $y + I$, lie in a unique further additive coset. This is evident from

\[ (x + I)(y + I) \subseteq xy + xI + Iy + I^2 \subseteq xy + I + I + I = xy + I. \tag{7.9} \]

With respect to addition and multiplication, $R/I$ forms a ring which we call the factor
ring. Moreover, by Eq. (7.9), the homomorphism of additive groups $R \to R/I$ is a
ring epimorphism.
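The containment in Eq. (7.9) can be illustrated in the most familiar case. The Python sketch below assumes the ring of integers and the ideal of multiples of 6 (an arbitrary choice made here), and checks on a finite window of representatives that all products of elements of two cosets land in a single coset. The names `coset` and `check_well_defined` are hypothetical.

```python
n = 6  # the ideal I = 6Z inside the ring Z (illustrative choice)

def coset(x):
    """Canonical representative of the additive coset x + I."""
    return x % n

def check_well_defined(x, y, span=3):
    """Check that (x + i)(y + j) lies in xy + I for a window of i, j in I."""
    target = coset(x * y)
    return all(coset((x + n * s) * (y + n * t)) == target
               for s in range(-span, span + 1)
               for t in range(-span, span + 1))

# every pair of cosets of I is tested through its representatives 0..n-1
ok = all(check_well_defined(x, y) for x in range(n) for y in range(n))
```

Here `ok` is true, which is the computational shadow of the statement that multiplication of cosets does not depend on the chosen representatives.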


7.2.2 The Fundamental Theorems of Ring Homomorphisms 


Suppose $f : R \to S$ is a homomorphism of rings. Then the kernel of the homomorphism
f is the set

\[ \ker f := \{x \in R \mid f(x) = 0_S\}. \]


We have 


Theorem 7.2.1 (The Fundamental Theorem of Ring Homomorphisms) 


1. The kernel of any ring homomorphism is a two-sided ideal. 

2. If $f : R \to S$ is a homomorphism of rings, then there is a natural isomorphism
between the factor ring $R/\ker f$ and the image ring $f(R)$, which takes each coset
$x + \ker f$ to $f(x)$.


Proof Part 1. Suppose $f : R \to S$ is a ring homomorphism and that x is an element
of $\ker f$. Since $\ker f$ is also the kernel of the group homomorphism of the underlying
additive groups of the two rings, $\ker f$ is a subgroup of the additive group of R.
Moreover, for any element r in R, $f(rx) = f(r)f(x) = f(r) \cdot 0_S = 0_S$, and similarly
$f(xr) = 0_S$. Thus $R(\ker f) + (\ker f)R \subseteq \ker f$, and so $\ker f$ is a two-sided ideal.

Part 2. The image ring $f(R)$ is a subring of S formed from all elements of S which
are of the form $f(x)$ for at least one element x in R. We wish to define a mapping
$\phi : R/(\ker f) \to f(R)$ which will take the additive coset $x + \ker f$ to the image
element $f(x)$. We must show that this mapping is well-defined. Now if $x + \ker f = y + \ker f$,
then $x + (-y) \in \ker f$, so $0_S = f(x + (-y)) = f(x) - f(y)$, whence
$f(x) = f(y)$. So changing the coset representative does not change the image, and
so the mapping is well-defined. (In fact the $\phi$-image of $x + \ker f$ is found simply by
applying f to the entire coset, that is, $f(x + \ker f) = f(x)$.)

The mapping $\phi$ is easily seen to be a ring homomorphism since

\[ \phi((x + \ker f) + (y + \ker f)) = \phi(x + y + \ker f) = f(x + y) = f(x) + f(y) = \phi(x + \ker f) + \phi(y + \ker f), \]
\[ \phi((x + \ker f)(y + \ker f)) = \phi(xy + \ker f) = f(xy) = f(x)f(y) = \phi(x + \ker f) \cdot \phi(y + \ker f). \]

Clearly $\phi$ is onto since, by definition, $f(x) = \phi(x + \ker f)$ for any image element
$f(x)$. If $\phi(x + \ker f) = \phi(y + \ker f)$ then $f(x) = f(y)$, so $f(x + (-y)) = 0_S$, and
this gives $x + \ker f = y + \ker f$. Thus $\phi$ is an injective mapping.

We have assembled all the requirements for $\phi$ to be an isomorphism.
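A small finite instance may make the construction of $\phi$ concrete. The Python sketch below assumes the reduction map from the integers mod 12 to the integers mod 4 (a choice made here for illustration), computes its kernel and cosets, and checks that each coset is sent to a single element and that distinct cosets have distinct images.

```python
R = range(12)  # the ring Z/12Z, with elements represented by 0..11

def f(x):
    # reduction mod 4; well defined on Z/12Z because 4 divides 12
    return x % 4

# f respects addition and multiplication
is_hom = all(f((x + y) % 12) == (f(x) + f(y)) % 4 and
             f((x * y) % 12) == (f(x) * f(y)) % 4
             for x in R for y in R)

kernel = sorted(x for x in R if f(x) == 0)
cosets = {frozenset((x + k) % 12 for k in kernel) for x in R}

# phi sends the coset x + ker f to f(x); record the set of values f takes on each coset
phi = {c: {f(x) for x in c} for c in cosets}
well_defined = all(len(values) == 1 for values in phi.values())
images = sorted(min(values) for values in phi.values())
```

In this instance the kernel is {0, 4, 8}, there are four cosets, and `images` lists each element of the image ring exactly once, so the induced map is a bijection.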


Before leaving this subsection, we introduce an important class of ring homomorphisms
which plays a role in the theory of R-algebras.³

Let A and B be two rings that contain a common subring R lying in the center of
both A and B. A ring homomorphism $f : A \to B$ for which $f(ra) = r f(a) = f(a) r$
for all ring elements $a \in A$ and $r \in R$ is called an R-homomorphism. (We shall
generalize this notion to R-modules in the next chapter.)


7.2.3 The Poset of Ideals 


We wish to make certain statements that hold for the set of all left ideals of R as 
well as for the set of all right ideals and all (two-sided) ideals of R. We do this by 
asserting “X holds for (left, right) ideals with property Y”. This means the statement
is asserted three times: once about ideals with property Y, again about left ideals
with property Y, and again about right ideals with property Y. It does save a little
space, and helps point out a uniformity among the theorems. But on the other hand it
makes statements difficult to scan, and sometimes renders the statement a little less
indelible in the student’s memory. So we will keep this practice to a minimum.


Lemma 7.2.2 Suppose A and B are (left, right) ideals in the ring R. Then


1. $A + B := \{a + b \mid (a, b) \in A \times B\}$ is itself a (left, right) ideal and is the unique
such ideal minimal with respect to containing both A and B.

2. $A \cap B$ is a (left, right) ideal which is the maximal such ideal contained in both A
and B.

3. The set of all (left, right) ideals of R forms a partially ordered set (that is, a poset)
with respect to the containment relation. In view of the preceding two conclusions,
this poset is a lattice.
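For a finite commutative ring the lattice operations of Lemma 7.2.2 can be computed by brute force. The sketch below assumes the ring of integers mod 12, in which every ideal is principal (a standard fact, used here without proof); the helper `ideal` and the chosen generators 4 and 6 are illustrative.

```python
n = 12  # work in the ring Z/12Z

def ideal(g):
    """The principal ideal generated by g in Z/nZ, as a frozenset."""
    return frozenset((g * r) % n for r in range(n))

# all ideals of Z/12Z (each one is principal)
ideals = {ideal(g) for g in range(n)}

A, B = ideal(4), ideal(6)
sum_AB = frozenset((a + b) % n for a in A for b in B)   # A + B
meet_AB = A & B                                          # A ∩ B

# A + B is the least ideal containing both; A ∩ B is the greatest ideal inside both
sum_is_least = all(sum_AB <= I for I in ideals if A <= I and B <= I)
meet_is_greatest = all(I <= meet_AB for I in ideals if I <= A and I <= B)
```

Here the sum is the ideal generated by 2 (the greatest common divisor of 4 and 6) and the intersection is the zero ideal (12 being their least common multiple).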


Theorem 7.2.3 (The Correspondence Theorem for Rings) Suppose $f : R \to S$ is an
epimorphism of rings, that is, $S = f(R)$. Then there is a one-to-one correspondence
between the left (right) ideals of S and the left (right) ideals of R which contain $\ker f$.⁴
Under this correspondence, 2-sided ideals are matched only with 2-sided ideals.


³R-algebras are certain rings that are also R-modules in a way that is compatible with ring
multiplication. Since we have not yet defined R-modules, we cannot introduce R-algebras at this point.
Nor do we need to do so. R-homomorphisms between rings can still be defined at this point and we
do require this concept for describing a universal property of polynomial rings on p. 207.


Proof The statement of the theorem is to be read as two statements, one about 
isomorphic posets of left ideals and one about isomorphic posets of right ideals. We 
prove the theorem only for left ideals. The proof for right ideals is virtually a mirror 
reflection of the proof we give here. 

If J is a left ideal of the ring S, set $f^{-1}(J) := \{r \in R \mid f(r) \in J\}$. Clearly if a
and b are elements of $f^{-1}(J)$, and r is an arbitrary element of R, then

\[ f(ra) = f(r)f(a) \in J, \]
\[ f(-a) = -f(a) \in J, \]
\[ f(a + b) = f(a) + f(b) \in J. \]

So ra, −a, and a + b all belong to $f^{-1}(J)$. Thus $f^{-1}(J)$ is a left ideal of R
containing $f^{-1}(0_S) = \ker f$. If also J is a right ideal of S, we have $JS \subseteq J$. Then
$f(f^{-1}(J)R) \subseteq J f(R) = JS \subseteq J$, whence $f^{-1}(J)R \subseteq f^{-1}(J)$. Thus $f^{-1}(J)$ is
a right ideal if and only if J is.

Now if I is a left ideal of R, then $f(I)$ is a left ideal of $f(R) = S$. But if
I contains $\ker f$, then $f^{-1}(f(I)) = I$, because, for any element $x \in f^{-1}(f(I))$,
there is an element $i \in I$ such that $f(x) = f(i)$. But then $x + (-i) \in \ker f \subseteq I$,
whence $x \in I$.

We thus see that the operator $f^{-1}$ induces a mapping of the poset of left ideals of
S onto the set of left ideals of R containing $\ker f$. It remains only to show that this
(containment-preserving) operator is one-to-one. Suppose $I = f^{-1}(J_1) = f^{-1}(J_2)$.
We claim $J_1 \subseteq J_2$. If not, there is an element $y \in J_1 - J_2$. Since f is an epimorphism,
y has the form $y = f(x)$, for some element x in R. Then x is in $f^{-1}(J_1)$ but not in
$f^{-1}(J_2)$, a contradiction. Thus we have $J_1 \subseteq J_2$. By symmetry, $J_2 \subseteq J_1$ and so
$J_1 = J_2$. This completes the proof.


A SECOND PROOF We can take advantage of the fact that the kernel of the surjective
ring homomorphism $f : R \to S$ is also the kernel of the homomorphism
$f^* : (R, +) \to (S, +)$ of the underlying additive groups and exploit the fact that the
Correspondence Theorem for Groups already presents us with a poset isomorphism

\[ \mathcal{A}(R : \ker f) \to \mathcal{A}(S : 0), \]

where $\mathcal{A}(M : N)$ denotes the poset of all subgroups of M which contain the subgroup
N. The isomorphism is given by sending a subgroup A in $\mathcal{A}(R : \ker f)$ to the subgroup
$f(A)$ of S.

We need only show that if X is a subgroup in $\mathcal{A}(R : \ker f)$, then

1. $RX \subseteq X$ if and only if $S f(X) \subseteq f(X)$.
2. $XR \subseteq X$ if and only if $f(X) S \subseteq f(X)$.

Both of these are immediate consequences of the fact that f is a ring homomorphism,
and $S = f(R)$.


⁴The latter poset is the principal filter generated by $\ker f$ in the poset of left (right) ideals of R.

As in the case of groups, when we say maximal (left, right) ideal, we mean a 
maximal member in the poset of all proper (left, right) ideals. Thus the ring R itself, 
though an ideal, is not a maximal ideal. 


Lemma 7.2.4 Every proper (left, right) ideal lies in a maximal (left, right) ideal.


Proof Let P be the full poset of all proper ideals, all proper left ideals or all proper
right ideals of the ring R. The proof is essentially the same for these three cases. Let

\[ \mathcal{C} = \{J_\sigma \mid \sigma \in I\} \]

be a totally ordered subset (that is, an ascending chain) in the poset P. Let U be
the set-theoretic union of the $J_\sigma$. Then any ring element in
U belongs to some $J_\sigma$. Thus any (left, right) multiple of that element by an element
of R belongs to U. Thus it is easy to see that RUR, RU, or UR is a subset of U
in the three respective cases of P. Similarly, $-U = U$ and, using the total ordering
of the chain, $U + U \subseteq U$. Thus U is a (left, right) ideal in R. If $U = R$, then $1_R$
would belong to some $J_\sigma$, whence $RJ_\sigma$, $J_\sigma R$ and $RJ_\sigma R$ would all coincide with R,
against $J_\sigma \in P$. Thus U, being a proper ideal, is a member of P and so is an upper
bound in P to the ascending chain $\mathcal{C}$.

Since any ascending chain in P has an upper bound in P, it follows from Zorn’s 
Lemma (see p. 26) that any element of P is contained in a maximal member 
of P. 


A simple ring is a ring with exactly two (two-sided) ideals: $0 := \{0_R\}$ and R. (Note
that the “zero-ring” in which $1_R = 0_R$ is not considered a simple ring here. Simple
rings are non-trivial.) Since every non-zero element of a division ring D is a unit, 0
is a maximal ideal of D. Thus any division ring is a simple ring. The really classical
example of a simple ring, however, is the full matrix ring over a division ring, which
(when the division ring is a field) appears among the examples in Sect. 7.4.


Lemma 7.2.5 1. An ideal A of R is maximal in the poset of two-sided ideals if and
only if $R/A$ is a simple ring.
2. An ideal A of a commutative ring R is maximal if and only if $R/A$ is a field.


Proof Part 1 is a direct consequence of the Correspondence Theorem 7.2.3 of the
previous subsection.

Part 2. Set $\bar{R} := R/A$. If $\bar{R}$ is a field, it is a simple ring, and A is maximal by
Part 1. If A is a maximal ideal, the commutative ring $\bar{R}$ has only 0 for a proper ideal.
Thus for any non-zero element $\bar{x}$ of $\bar{R}$, $\bar{x}\bar{R} = \bar{R}\bar{x} = \bar{R}$. Thus any non-zero element
of $\bar{R}$ is a unit. Thus $\bar{R}$ is a commutative division ring, and hence is a field. The proof
is complete.
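Part 2 of Lemma 7.2.5 can be watched in action in the ring of integers. The sketch below relies on the standard fact that the ideals of the integers containing the multiples of n are exactly the sets of multiples of d for the divisors d of n; the function names are hypothetical and the search is by brute force.

```python
def is_field(n):
    """Z/nZ is a field iff every non-zero class has a multiplicative inverse."""
    return n > 1 and all(any((x * y) % n == 1 for y in range(1, n))
                         for x in range(1, n))

def is_maximal(n):
    """nZ is maximal iff the only ideals dZ containing it are nZ itself and Z,
    that is, iff the only positive divisors of n are 1 and n."""
    return n > 1 and [d for d in range(1, n + 1) if n % d == 0] == [1, n]

# the two conditions agree on a range of moduli
agree = all(is_field(n) == is_maximal(n) for n in range(2, 30))
```

For example, the multiples of 7 form a maximal ideal and the integers mod 7 form a field, whereas the multiples of 6 sit properly inside the multiples of 2 and the integers mod 6 contain the non-invertible class of 2.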


Suppose now that A and B are ideals in a ring R. In a once-only departure
from our standard notational convention (special for two-sided ideals) we let the
symbol AB denote the additive subgroup of $(R, +)$ generated by the set of elements
$\{ab \mid (a, b) \in A \times B\}$ (the latter would ordinarily have been written as “AB” in our
standard notation). That is, for ideals A and B, AB is the set of all finite sums of the
form

\[ \sum_j a_j b_j, \quad a_j \in A, \ b_j \in B. \]

If we multiply such an expression on the left by a ring element r, then by the left
distributive law, each $a_j$ is replaced by $ra_j$, which is in A since A is a left ideal,
yielding thus another such expression. Similarly, right multiplication of such an
element by any element of R yields another such element. Thus AB is an ideal.⁵

Such an ideal AB clearly lies in $A \cap B$, but it might be even smaller. Indeed, if B
lies in the right annihilator of A, that is, the set of elements x of R such that $Ax = 0$,
then $AB = 0$, while possibly $A \cap B$ is a non-zero ideal (with multiplication table all
zeroes, of course).
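The gap between the ideal product and the intersection is easy to exhibit in small rings of integers mod n. In the sketch below, the helpers `ideal` and `product` and the chosen moduli are assumptions made for illustration; `product` builds the additive subgroup generated by all products ab, which in a finite ring is obtained by closing under addition.

```python
def ideal(g, n):
    """The principal ideal generated by g in Z/nZ."""
    return frozenset((g * r) % n for r in range(n))

def product(A, B, n):
    """Ideal product AB: the additive subgroup generated by all products ab."""
    gens = {(a * b) % n for a in A for b in B}
    span = {0}
    while True:
        # add generators repeatedly until no new elements appear
        bigger = span | {(s + g) % n for s in span for g in gens}
        if bigger == span:
            return frozenset(span)
        span = bigger

# In Z/4Z the ideal A = (2) annihilates itself: AA = 0 although A ∩ A = A is non-zero.
A4 = ideal(2, 4)
AA4 = product(A4, A4, 4)

# In Z/12Z, (2)(2) = (4), which is properly contained in (2) ∩ (2) = (2).
A12 = ideal(2, 12)
AA12 = product(A12, A12, 12)
```

In both cases the product lies inside the intersection, and in both cases the containment is proper.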

This leads us to still another construction of new ideals from old: Suppose A and
B are right ideals in the ring R with $B \le A$. The right residual quotient is the set

\[ (B : A) := \{r \in R \mid Ar \subseteq B\}. \]

Similarly, if B and A are left ideals of R, with $B \le A$, the same symbol $(B : A)$
will denote the left residual quotient, the set of ring elements r such that $rA \subseteq B$.⁶


Lemma 7.2.6 If $B \le A$ is a chain of right (left) ideals of the ring R, then the right
(left) residual quotient $(B : A)$ is a two-sided ideal of R.


Proof We prove here only the right-hand version. Set $W := (B : A)$. Clearly
$W + W \subseteq W$, $W = -W$, and $WR \subseteq W$, since B is a right ideal. It remains to show
$RW \subseteq W$. Since A is also a right ideal,

\[ A(RW) = (AR)W \subseteq AW \subseteq B, \]

so by definition, $RW \subseteq W$.


A prime ideal of the ring R is an ideal, say P, such that whenever two ideals
A and B have the property that $AB \subseteq P$, then either $A \subseteq P$ or $B \subseteq P$. (Note:
here AB is the product of two ideals as defined above. But of course, since P is an
additive group, AB (as a product of two ideals) lies in P if and only if AB (the old
set product) lies in P, so the multiple notation is not critical here.)


⁵Indeed, one can see that it would be a two-sided ideal even if A were only a left ideal and B
were a right ideal. It is unfortunate to risk this duplication of notation: AB as ideal product, or
set product, but one feels bound to hold to the standard notation of the literature (when there is a
standard) even if there are small overlaps in that standard.

⁶The common notation $(B : A)$ should not cause confusion. One knows which concept is intended,
from the fact that one is either talking about right ideals A and B or left ideals, or, in the case that the
ideals are two-sided, by the explicit use of the word “right” or “left” in front of “residual quotient”.

A rather familiar example of a prime ideal would be the set of all multiples of a
prime number—say 137, to be specific—in the ring of integers. This set is both a 
prime ideal and a maximal ideal. The ideal {0} in the ring of integers would be an 
example of a prime ideal which is not a maximal ideal. 

We have the following analog of Lemma 7.2.5 for prime ideals in a commutative 
ring: 


Lemma 7.2.7 Let R be a commutative ring, and let P be an ideal of R. Then the
following conditions are equivalent:


(i) P is a prime ideal;
(ii) $R/P$ is an integral domain, that is, a commutative ring whose non-zero elements are
closed under multiplication.


Proof (i) implies (ii). Suppose P is a prime ideal. Suppose $x + P$ and $y + P$ are
two non-zero cosets of P (that is, non-zero elements of $R/P$) for which $(x + P)(y + P) \subseteq P$. Then
$xy \in P$ and so $(xR)(yR) \subseteq P$. But since R is commutative, xR and yR are ideals
of R whose product lies in the prime ideal P. It follows that one of these ideals is
already in P, forcing one of the cosets $x + P$ or $y + P$ to be the zero coset $0 + P = P$,
contrary to assumption. Thus the non-zero elements of the factor ring (the non-zero
additive cosets of P) are closed under multiplication. That makes $R/P$ an integral
domain.

(ii) implies (i). Now suppose P is an ideal of the commutative ring R for which
$R/P$ is an integral domain. We must show that P is a prime ideal. Suppose A and B
are ideals of R with the property that $AB \subseteq P$. Then $(A + P)/P$ and $(B + P)/P$
are two ideals of $R/P$ whose product ideal $(A + P)(B + P)/P = (AB + P)/P$
is the zero ideal of $R/P$. Since $R/P$ is an integral domain, it is not possible that
both ideals, $(A + P)/P$ and $(B + P)/P$, contain non-zero cosets. Thus, one of these
ideals of $R/P$ is already the zero ideal. That forces $A \subseteq P$ or $B \subseteq P$. Since, for
any two ideals A and B, $AB \subseteq P$ implies $A \subseteq P$ or $B \subseteq P$, P must be a prime
ideal.


Theorem 7.2.8 In any ring, a maximal ideal is a prime ideal.


Proof Suppose M is a maximal ideal in R and A and B are ideals such that $AB \subseteq M$.
Assume A is not contained in M. Then $A + M = R$, by the maximality of M, and so
$RB = (A + M)B = AB + MB \subseteq M$. Hence B lies in the two-sided ideal $(M : R)$. Since M is a
left ideal, M also lies in this right residual quotient. Thus we see that $M + B$ lies
in this right residual quotient $(M : R)$. But the latter cannot be R since 1 does not
belong to this residual quotient. Thus $M + B$ is a proper ideal of R and maximality
of M as a two-sided ideal forces $B \subseteq M$.
Thus either $A \subseteq M$ or (as just argued) $B \subseteq M$. Thus M is a prime ideal.


7.3 Monoid Rings and Polynomial Rings


7.3.1 Some Classic Monoids 


Recall that a monoid is a set with an associative operation (say, multiplication) and 
an identity element with respect to that operation. 


Example 36 (FREQUENTLY ENCOUNTERED MONOIDS) 


1. Of course any group is a monoid. 

2. The positive integers $\mathbb{Z}^+$ under multiplication. The integer 1 is the multiplicative
identity.

3. The set $\mathbb{N}$ of natural numbers (non-negative integers) under addition. Of course,
0 is the identity relative to this operation.

4. M(X), the multisets over a set X. The elements of this monoid consist of all
mappings $f : X \to \mathbb{N}$ which achieve a non-zero value at only finitely many
elements of X. The addition of functions f and g yields the function $f + g$ whose
value at x is defined to be $f(x) + g(x)$, the sum of two natural numbers (see p.
xiii).

When $X = \{x_1, \ldots, x_n\}$ is finite, there are three common ways to represent the
monoid of multisets.


(a) We can think of a multiset as a sequence of n natural numbers. In this way,
the mapping $f : X \to \mathbb{N}$ is represented by the sequence

\[ (f(x_1), f(x_2), \ldots, f(x_n)). \]

Addition is performed entry-wise, that is, if $\{a_i\}$ and $\{b_i\}$ are two sequences
of natural numbers of length n, then

\[ (a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n). \]

(b) We can regard a multiset f as an inventory of the number of objects of
each type from a list of types, X. We can then represent this inventory for
f as a linear combination $\sum_i f(x_i) x_i$, where two such linear combinations
are regarded as distinct if they differ at any coefficient. Addition is then
performed by simply merging the two inventories, in effect adding together
the linear combinations coefficient-wise. For example, if $X = \{a, b, o\}$,
where a = apples, b = bananas, and o = oranges, then $(a + 2b) + (b + 2o) = a + 3b + 2o$;
that is, an apple and two bananas added to a banana and two
oranges is one apple, three bananas and two oranges. Thus one can add
apples and oranges in this system!⁷


⁷See p. 51 to realize that the positive-integer linear combinations simply encode the mappings
of finite support that are elements of the monoid of multisets defined there. We used just such a

(c) Finally one can represent a multiset over $X = \{x_1, \ldots, x_n\}$ as a monomial in
commuting indeterminates. Thus the multiset f is represented as a product
$\prod_{i=1}^{n} x_i^{f(x_i)}$, a monomial in the commuting indeterminates $\{x_i\}$. (Terms with a
zero exponent are represented by 1 or are omitted.) For example, if n = 4,
then the sequence (2, 0, 1, 3) corresponds to the monomial $x_1^2 x_3 x_4^3$. Multiplication here is the usual multiplication
of monomials one learns in freshman algebra, where exponents of a common
“variable” can be added. We denote this monoid by the symbol $M^*(X)$, to
emphasize that the operation is multiplication. Of course we have a monoid
isomorphism $M(X) \to M^*(X)$ in this case.⁸ This isomorphism takes a
sequence $(a_1, \ldots, a_n)$ of natural numbers to the monomial $x_1^{a_1} \cdots x_n^{a_n}$.

5. The Free Monoid over X. This was the monoid whose elements are the words
of finite length over the alphabet X. These are sequences (possibly empty) of
juxtaposed elements of X, $x_1 x_2 \cdots x_m$, where $m \in \mathbb{N}$ and the “letters” $x_i$ are
elements of X, as well as the empty word which we denote by $\phi$. The associative
operation is concatenation of words, the act of writing one word right after another.
The empty word is an identity element. We utilized this monoid in constructing
free groups in Chap. 6, p. 166.
In particular, the free monoid over the single letter alphabet $X = \{x\}$ consists of
$\phi, x, xx, xxx, \ldots$, which we denote as $1, x, x^2, x^3, \ldots$, etc. Clearly this monoid is
isomorphic to the monoid $\mathbb{N}$ of non-negative integers under addition (the third
item in this list).

6. Let L be a lower semilattice with a “one” element 1. Thus, L is a partially ordered
set $(P, \le)$ with the property that the poset ideal generated by any finite subset of
P possesses a unique maximal element. Thus if $X = \{x_1, \ldots, x_n\} \subseteq P$, then the
poset ideal $J := \{y \in P \mid y \le x_i, \text{ for } i = 1, \ldots, n\}$ contains a unique maximal
element called the “meet” of X and denoted $\bigwedge X = x_1 \wedge \cdots \wedge x_n$. When applied
to pairs of elements, “$\wedge$” becomes an associative binary operation on P. In this
way $(P, \wedge)$ forms a monoid M(L), whose multiplicative identity is the element 1.
Similarly one derives a monoid $M(P, \vee)$ from an upper semilattice P with a
zero element, using the “join” operation. One just applies the construction of the
previous paragraph to its dual semilattice.
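Several of the monoids of Example 36 have direct counterparts among ordinary Python values. The sketch below is only an illustration under assumptions made here: multisets are modelled by `collections.Counter`, monomials by exponent tuples, words of the free monoid by tuples of letters, and the semilattice is taken to be the divisors of 12 ordered by divisibility, where the meet is the greatest common divisor and the “one” element is 12.

```python
from collections import Counter
from math import gcd

# Item 4(b): multisets as inventories; Counter addition merges the inventories.
f = Counter({"a": 1, "b": 2})        # one apple, two bananas
g = Counter({"b": 1, "o": 2})        # one banana, two oranges
total = f + g

# Item 4(c): monomials in x1..x4 as exponent tuples; multiplying adds exponents.
def mono_mul(u, v):
    return tuple(a + b for a, b in zip(u, v))

# Item 5: the free monoid; words are tuples of letters, the empty tuple is the identity.
w = ("x", "y") + ("y", "x")

# Item 6: divisors of 12 under divisibility; meet = gcd, identity = 12.
divisors = [d for d in range(1, 13) if 12 % d == 0]
identity_ok = all(gcd(12, d) == d for d in divisors)
assoc_ok = all(gcd(gcd(a, b), c) == gcd(a, gcd(b, c))
               for a in divisors for b in divisors for c in divisors)
```

The inventory sum reproduces the apples-and-oranges computation from part 4(b), and the last two flags confirm the monoid axioms for the meet operation on this small semilattice.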


7.3.2 General Monoid Rings 


Let R be any ring, and let M be a monoid. The monoid ring RM is the set of all
functions $f : M \to R$ such that $f(m) \ne 0$ for only finitely many elements $m \in M$.


(Footnote 7 continued)

multiset monoid over the set of isomorphism classes of simple groups in proving the Jordan-Hölder
Theorem for groups.

⁸The need for a multiplicative representation (like $M^*(X)$) rather than an additive one (like M(X))
occurs when a monoid is to be embedded into the multiplicative semigroup of elements of a ring.
Without it there would be a confusion of the two additive operations.

We make RM into a ring as follows. First of all, addition is defined by “pointwise
addition” of functions, i.e., if $f, g \in RM$, then we define

\[ (f + g)(m) = f(m) + g(m) \in R, \quad m \in M. \]

Clearly, this operation gives RM the structure of an additive abelian group. Next,
multiplication is defined by what is called convolution of functions. That is, if
$f, g \in RM$, then we define

\[ (f * g)(m) = \sum_{m'm''=m} f(m')g(m'') \in R, \quad m \in M. \tag{7.10} \]

The summation on the right hand side of Eq. (7.10) is taken over all pairs $(m', m'') \in M \times M$
such that $m'm'' = m$. Note that since f and g are non-zero at only finitely
many elements of the domain M, the sum on the right side is a sum of finitely many
non-zero terms, and so is a well-defined element of R.

If $1_M$ is the identity of the monoid M and if 1 is the identity of R, then the function
$e : M \to R$ defined by setting

\[ e(m) = \begin{cases} 1 & \text{if } m = 1_M, \\ 0 & \text{if } m \ne 1_M, \end{cases} \]

is the multiplicative identity in RM. Indeed, if $f \in RM$, and if $m \in M$, then

\[ (e * f)(m) = \sum_{m'm''=m} e(m')f(m'') = f(m), \]

and so $e * f = f$. Similarly, $f * e = f$, proving that e is the multiplicative identity
of RM.

Next, we establish the associative law for multiplication. Thus, let $f, g, h \in RM$,
and let $m \in M$. Then, through repeated use of the distributive law among elements
of R, we have:

\[
\begin{aligned}
(f * (g * h))(m) &= \sum_{m'm''=m} f(m')\,(g * h)(m'') \\
&= \sum_{m'm''=m} f(m') \Big( \sum_{n'n''=m''} g(n')h(n'') \Big) \\
&= \sum_{m'm''=m} \Big( \sum_{n'n''=m''} f(m')g(n')h(n'') \Big) \\
&= \sum_{m'n'n''=m} f(m')g(n')h(n'') \\
&= \sum_{k'm''=m} \Big( \sum_{m'n'=k'} f(m')g(n') \Big) h(m'') \\
&= \sum_{k'm''=m} (f * g)(k')\,h(m'') \\
&= ((f * g) * h)(m).
\end{aligned}
\]


The distributive laws in RM are much easier to verify, and so we conclude that 
(RM, +, *) is indeed a ring. 
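The construction of RM translates almost verbatim into code. In the following Python sketch an element of RM is a dictionary mapping monoid elements to their non-zero coefficients; R is taken to be the integers and M the free monoid of strings under concatenation. Both choices, and the names `add`, `convolve` and `concat`, are assumptions made for illustration.

```python
from collections import defaultdict

def add(f, g):
    """Pointwise addition of finitely supported functions M -> R."""
    h = defaultdict(int)
    for src in (f, g):
        for m, c in src.items():
            h[m] += c
    return {m: c for m, c in h.items() if c != 0}

def convolve(f, g, op):
    """(f * g)(m) = sum of f(m') g(m'') over all pairs with m' m'' = m."""
    h = defaultdict(int)
    for m1, a in f.items():
        for m2, b in g.items():
            h[op(m1, m2)] += a * b      # coefficient of f stays on the left
    return {m: c for m, c in h.items() if c != 0}

def concat(u, v):
    # the monoid operation of the free monoid on strings; "" is the identity
    return u + v

e = {"": 1}                  # the identity element of RM
f = {"x": 2, "xy": 1}
g = {"": 1, "y": -1}
h = {"yx": 3, "x": 1}

fg = convolve(f, g, concat)
```

With these definitions one can test the ring axioms on sample elements, for instance `convolve(e, f, concat) == f` and the associativity of `convolve` on f, g, h. Since the free monoid on two letters is not commutative, `convolve(f, g, concat)` and `convolve(g, f, concat)` differ here.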

One might ask: “Why all this business about mappings $M \to R$? Why not just
say the elements of RM are just R-linear combinations of elements of M such as

\[ \sum_{m \in M} a_m m \]

with only finitely many of the $a_m$ non-zero?” The answer is that we can, but only
with the following:


Proviso: linear combinations of elements of M which display different coefficients
are to be regarded as distinct elements of RM.⁹


The correspondence between R-valued functions on the monoid M and finite
R-linear sums as above is given by the mapping

\[ f \mapsto \sum_{m \in M} f(m)\,m \in RM. \]


Since the “Proviso” given above is in force, this correspondence is bijective and is 
easily seen to be a ring isomorphism. 

Writing the elements of RM as R-linear combinations of elements of M sometimes
makes calculation easier. Thus if $\alpha = \sum a_m m$ and $\beta = \sum b_n n$ are elements
of RM (so only finitely many of the $a_m$ and $b_n$ are non-zero), then

\[ \alpha\beta = \sum_{m \in M} \Big( \sum_{kn=m} a_k b_n \Big) m. \]

Note that since R may not be a commutative ring, the $a_k$’s must always appear to the
left in the products $a_k b_n$ in the formula presented above.


⁹In mathematics the phrase “linear combinations of a set of objects X” is used, even when those
objects themselves satisfy linear relations among themselves. So we can’t just say a monoid ring
RM is the collection of R-linear combinations of the elements of M until there is some assurance
that the elements of M do not already satisfy some R-linear dependence relation. That is why one
uses the formalism of mappings $M \to R$. Two mappings differ if and only if they differ in value at
some argument in M.


7.3.3 Group Rings 


A first example is the so-called group ring. Namely, let G be a multiplicative group
(regarded, of course, as a monoid), let R be a ring, and call the resulting monoid ring
RG the R-group ring of G. Thus, elements of RG are the finite sums $\sum_{g \in G} a_g g$ where
each $a_g \in R$. (The sums are finite because only finitely many of the coefficients $a_g$ are
non-zero.) Note that the group ring is commutative if and only if R is a commutative
ring and G is an abelian group.¹⁰
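The commutativity criterion can be observed on two small groups. The Python sketch below assumes integer coefficients, models the symmetric group on three letters by permutation tuples and the cyclic group of order 3 by the integers mod 3 under addition; the function names and sample elements are illustrative.

```python
from collections import defaultdict

def multiply(a, b, op):
    """Product in the group ring: convolution of coefficient dictionaries."""
    out = defaultdict(int)
    for g, x in a.items():
        for h, y in b.items():
            out[op(g, h)] += x * y
    return {g: c for g, c in out.items() if c != 0}

def compose(p, q):
    """Composition of permutations of {0, 1, 2}: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def add_mod3(g, h):
    return (g + h) % 3

# Two elements of the integral group ring of the symmetric group on 3 letters.
s = (1, 0, 2)                 # the transposition exchanging 0 and 1
t = (0, 2, 1)                 # the transposition exchanging 1 and 2
a = {s: 1, (0, 1, 2): 2}      # s + 2 * identity
b = {t: 1}

# Two elements of the integral group ring of the cyclic group of order 3.
c = {0: 1, 1: 2}
d = {1: -1, 2: 5}
```

Here `multiply(a, b, compose)` and `multiply(b, a, compose)` differ because s and t do not commute, while the two products of c and d in the cyclic case coincide.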


7.3.4 Möbius Algebras


Let M be the monoid $M(P, \vee)$ of part 6 of Example 36. Here $(P, \le)$ is an upper
semilattice with a “zero” element. Thus for any two elements a and b of P, there is an
element $a \vee b$ which is the unique minimal element in the filter $\{y \in P \mid y \ge a, y \ge b\}$
generated by a and b. Then for any ring R, we may form the monoid ring RM. Since
M is commutative in this case, the monoid ring is commutative if the ring R is
commutative.

In the case that P is finite, Theorem 2.4.2 forces the semilattice P to be a lattice.
In that particular case, if R is a field F, then the monoid ring FM is called a Möbius
Algebra of P over F and is denoted $A_V(P)$. There are many interesting features of
this algebra that can be used to give short proofs for the many identities involving
the Möbius function of a poset. The reader is referred to the classic paper of Curtis
Greene [1].


7.3.5 Polynomial Rings 


The Ring R[x] 


If M is the free monoid on the singleton set $\{x\}$, then we write $R[x] := RM$ and
call this the polynomial ring over R in the indeterminate x. At this point the reader,
who almost certainly has had previous experience with abstract algebra, will have
encountered polynomials, but might not recognize them in the present guise. Indeed,
polynomials are most typically represented as elements (sometimes referred to as
“formal sums”) of the form

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n, \]

where the “coefficients” $a_0, a_1, \ldots, a_n$ come from the ring R, and polynomials with
different coefficients are regarded as different polynomials (the Proviso). If f is
written as above, and if $a_n \ne 0$, then we say that f has degree n.


¹⁰Modules over group rings are the source of the vast and important subject known as “Representation
Theory”. Unfortunately it is beyond the capacity of a book such as this one on basic higher
algebra to do justice to this beautiful subject.

We mention in passing that polynomials are frequently written in “functional
form:”

\[ f(x) = \sum_{i=0}^{n} a_i x^i. \]

This is arguably misleading, as such a notation suggests that x is a “variable,” ranging
over some set. This is not the case, of course, as x is the fixed generator of the free
monoid on a singleton set and therefore does not vary at all. Yet, we shall retain
this traditional notation as we shall show below, through the so-called “evaluation
homomorphism”, that the functional notation above is not unreasonable.

Note that if R is a ring, and if we regard $R \subseteq R[x]$ as above, then we see that R
and the element x commute. Suppose now that R and S are rings, that $R \subseteq S$ and
that $s \in S$ is an element of S that commutes with every element of R. The subring of
S generated by R and s is denoted by $R[s]$. A moment’s thought reveals that every
element of $R[s]$ can be written as a polynomial in s with coefficients in R; that is, if
$\alpha \in R[s]$, then we can write $\alpha$ as

\[ \alpha = \sum_{i=0}^{n} a_i s^i, \]

for some nonnegative integer n, and where the coefficients $a_i \in R$. In this situation,
we can define the evaluation homomorphism $E_s : R[x] \to R[s] \subseteq S$ by setting

\[ E_s(f(x)) := f(s) = \sum_{i=0}^{n} a_i s^i, \]

where $f(x) = \sum_{i=0}^{n} a_i x^i$. That is to say, the evaluation homomorphism (at s) simply
“plugs in” the ring element s for the indeterminate x.

The evaluation homomorphism is, as asserted, a homomorphism of rings. To
prove this we need only show that $E_s$ preserves addition and multiplication. Let
$f(x), g(x) \in R[x]$, let $f(x)$ have degree k and let $g(x)$ have degree m. Setting
$n = \max\{k, m\}$ we may write $f(x) = \sum_{i=0}^{n} a_i x^i$, $g(x) = \sum_{i=0}^{n} b_i x^i \in R[x]$. Then

\[
\begin{aligned}
E_s(f(x) + g(x)) &= \sum_{i=0}^{n} (a_i + b_i) s^i \\
&= \sum_{i=0}^{n} a_i s^i + \sum_{i=0}^{n} b_i s^i \\
&= f(s) + g(s) = E_s(f(x)) + E_s(g(x)).
\end{aligned}
\]

Next, since

\[ f(x)g(x) = \sum_{l=0}^{k+m} \sum_{i=0}^{l} a_i b_{l-i} x^l, \]

we conclude, using the fact that s commutes with every element of R, that

\[
\begin{aligned}
E_s(f(x)g(x)) &= \sum_{l=0}^{k+m} \sum_{i=0}^{l} a_i b_{l-i} s^l \\
&= \Big( \sum_{i=0}^{k} a_i s^i \Big) \Big( \sum_{j=0}^{m} b_j s^j \Big) \\
&= f(s)g(s) = E_s(f(x)) E_s(g(x)).
\end{aligned}
\]

Therefore, $E_s : R[x] \to S$ is a ring homomorphism whose image is obviously the
subring $R[s] \subseteq S$.
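The two computations above can be replayed on a concrete pair of polynomials. The Python sketch below assumes R to be the integers, S the rational numbers and s = 1/2 (all choices made here for illustration); a polynomial is stored as its list of coefficients, constant term first, and the function names are hypothetical.

```python
from fractions import Fraction

def poly_add(f, g):
    """Coefficient-wise sum, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Product of two non-empty coefficient lists: the coefficient of x^l is
    the sum of a_i * b_(l-i)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def evaluate(f, s):
    """The evaluation map E_s: substitute s for the indeterminate x."""
    return sum(a * s**i for i, a in enumerate(f))

f = [1, 0, 2]          # 1 + 2x^2
g = [3, -1]            # 3 - x
s = Fraction(1, 2)

sum_ok = evaluate(poly_add(f, g), s) == evaluate(f, s) + evaluate(g, s)
prod_ok = evaluate(poly_mul(f, g), s) == evaluate(f, s) * evaluate(g, s)
```

Both flags are true: evaluating after adding or multiplying gives the same rational number as adding or multiplying the evaluated values.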


The Polynomial Ring R{X} in Non-commuting Indeterminates 


Let X be a set and let R be a fixed ring. The free monoid on X was defined on p.
166 and denoted M(X). It consists of sequences $(x_1, \ldots, x_n)$, $n \in \mathbb{N}$, of elements $x_i$
of X which are symbolically codified as “words” $x_1 x_2 \cdots x_n$. (When n is the natural
number zero, then we obtain the empty word which is denoted $\phi$.) The monoid
operation is concatenation (the act of writing one word $w_2$ right after another $w_1$ to
obtain the word $w_1 w_2$).

The resulting monoid ring RM(X) consists of all mappings $f : M(X) \to R$
which assume a value different from $0 \in R$ at only finitely many words of M(X).
Addition is pointwise and multiplication is the convolution defined on p. 200. Then,
as described there, RM(X) is a ring with respect to addition and multiplication. This
ring is called the polynomial ring in non-commuting indeterminates X and is denoted
$R\{X\}$.¹¹

Suppose S is a ring containing the ring R as a subring, and let B be a subset 
of the centralizer in S of R—i.e. B C Cs(R). Suppose a : X — B is some 
mapping of sets. We can then extend a to a function a : M(X) — Cs(R), by setting 


'l This notation is not completely standard. 
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a(w) = a(x1)a(x2)---a(%,) exactly when w is the word x1x2---Xn, and setting 
a(d) = Ir, the multiplicative identity element of the ring R as well as Cs(R). 

For each function f : M(X) — R which is an element of the monoid ring 
RM(X), define 


EVD) = Demat wa): (7.11) 


Since $f(w)$ can be non-zero for only finitely many words $w$ of the free monoid $M(X)$, the quantity on the right side of Eq. (7.11) is a well-defined element of $S$. [Of course we can rewrite all this in a way that more closely resembles conventional polynomial notation. Following the discussion on p. 202 we can represent the function $f \in RM(X)$ in polynomial notation as

$p_f := \sum_{w \in \mathrm{Supp}\, f} f(w)\, w$,

where $\mathrm{Supp}\, f$ is the “support” of $f$, the finite set of words $w$ for which $f(w) \neq 0 \in R$. Then

$E^*_\alpha(f) = E^*_\alpha(p_f) = \sum_{w \in \mathrm{Supp}\, f} f(w)\,\alpha(w)$,

a finite sum of elements of $R\,C_S(R) \subseteq S$.]

In Exercise (6) in Sect. 7.5.3 the student is asked to verify that $E^*_\alpha : RM(X) = R\{X\} \to S$ is a ring homomorphism, and is the unique ring homomorphism from $RM(X)$ to $S$ that extends the mapping $\alpha : X \to B$.
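The construction above can be tried out on a small case. The following is a minimal sketch, not the book's own method: it assumes $R = \mathbb{Z}$, takes $S$ to be the ring of 2-by-2 integer matrices (in which the scalar matrices are central), and models a word as a Python tuple of letters. The names `convolve`, `evaluate` and the particular matrices assigned to the letters are illustrative choices.

```python
from collections import defaultdict

def mat_mul(A, B):
    """Product of two 2-by-2 matrices stored as tuples of tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def mat_add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def mat_scale(r, A):
    return tuple(tuple(r * A[i][j] for j in range(2)) for i in range(2))

I2 = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def convolve(f, g):
    """Product in the monoid ring: collect f(u)*g(v) on the word uv."""
    h = defaultdict(int)
    for u, a in f.items():
        for v, b in g.items():
            h[u + v] += a * b          # tuple concatenation = monoid operation
    return {w: c for w, c in h.items() if c != 0}

def evaluate(f, alpha):
    """Sum of f(w) * alpha(w) over the (finite) support of f."""
    total = ZERO
    for word, coeff in f.items():
        value = I2                      # the empty word is sent to the identity
        for letter in word:
            value = mat_mul(value, alpha[letter])
        total = mat_add(total, mat_scale(coeff, value))
    return total

# Hypothetical assignment of the two letters to non-commuting matrices.
alpha = {"x": ((0, 1), (0, 0)), "y": ((0, 0), (1, 0))}
f = {("x",): 1, ("y",): 2}              # x + 2y
g = {(): 3, ("x", "y"): 1}              # 3 + xy

lhs = evaluate(convolve(f, g), alpha)
rhs = mat_mul(evaluate(f, alpha), evaluate(g, alpha))
print(lhs == rhs)                       # multiplicativity on this sample
print(convolve(f, g) == convolve(g, f)) # the indeterminates do not commute
```

On this sample the evaluation respects products even though `f` and `g` do not commute in the monoid ring; this illustrates, but of course does not prove, the homomorphism property asked for in the exercise.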


The Polynomial Ring R[X] over a Set of Commuting Indeterminates 


We have defined a multiplicative version of the monoid of multisets over $X$, namely $M^*(X)$ (see Example 36, part 4(c)). The elements of this monoid are the multiplicative identity element 1 and all finite monomial products $\prod_{i=1}^n x_i^{a_i}$, where $\{x_1, \ldots, x_n\}$ may range over any finite subset of $X$. There is a natural monoid homomorphism

$\deg : M^*(X) \to (\mathbb{N}, +)$

which is defined by

$\prod_{i=1}^n x_i^{a_i} \mapsto \sum_{i=1}^n a_i$.

The image $\deg m$ is called the degree of $m$, for $m \in M^*(X)$.

The monoid ring $RM^*(X)$ is denoted $R[X]$ in this case. As remarked, its elements can be uniquely described as formal sums $\sum a_m m$ where $m \in M^*(X)$, $a_m \in R$, and only finitely many of the coefficients $a_m$ are non-zero. The adjective “formal” is understood to mean that two such sums are distinct if and only if they exhibit some difference in their coefficients (our Proviso from p. 201). These sums are called polynomials. Since $M^*(X)$ is a commutative monoid, the polynomial ring $R[X]$ is a commutative ring if and only if $R$ is commutative.

A polynomial is said to be homogeneous if and only if all of its non-zero monomial 
summands possess the same degree, or if it is the zero polynomial. This common 
degree of non-zero summands of a homogeneous polynomial h is called the degree 
of that polynomial, and is again denoted deg h. 

Any polynomial $p = \sum a_m m$ can be expressed as a sum of homogeneous polynomials $p = \sum_i h_i$ where $h_i$ has degree $i \in \mathbb{N}$, simply by grouping the monomial summands according to their degree. Clearly, the uniqueness of the coefficients $a_m$ which define $p$ tells us that the decomposition of an arbitrary polynomial as a sum of homogeneous polynomials is unique. The degree of a non-zero polynomial $p = \sum_i h_i \in R[X]$ is defined to be the highest value $d = \deg h_i$ achieved for a non-zero homogeneous summand in $p$. By a further extension of notation, it is denoted $\deg p$. The convention is that the zero polynomial (where all $a_m = 0$) possesses every possible degree in $\mathbb{N}$.
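The grouping of monomial summands by degree can be sketched concretely. The representation below is an assumption made for illustration only: a polynomial in $n$ commuting indeterminates is stored as a dictionary from exponent tuples $(a_1, \ldots, a_n)$ to non-zero coefficients, and the function names are hypothetical.

```python
from collections import defaultdict

# 3*x^2*y + 5*y^3 - 7 in Z[x, y], keyed by the exponent tuple (a_1, a_2).
p = {(2, 1): 3, (0, 3): 5, (0, 0): -7}

def homogeneous_parts(poly):
    """Group the non-zero monomial summands of poly by total degree."""
    parts = defaultdict(dict)
    for exponents, coeff in poly.items():
        if coeff != 0:
            parts[sum(exponents)][exponents] = coeff
    return dict(parts)

def degree(poly):
    """Highest degree of a non-zero homogeneous summand of a non-zero poly."""
    parts = homogeneous_parts(poly)
    if not parts:
        # The text's convention gives the zero polynomial every degree,
        # so no single value is returned for it here.
        raise ValueError("zero polynomial: no single degree")
    return max(parts)

parts = homogeneous_parts(p)
print(parts)
print(degree(p))
```

Here the summands of degree 3 and the constant term land in separate homogeneous components, and the degree of `p` is the largest key.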

Recall from p. 187 that an integral domain is a commutative ring in which the 
non-zero elements are closed under multiplication. 


Lemma 7.3.1 If $D$ is an integral domain, then so is the polynomial ring $D[X]$.


Proof The proof proceeds in three steps. 
(Step 1) If D is an integral domain and X = {x}, then D[x] is an integral domain. 


For any non-zero polynomial $p = \sum a_i x^i$ (a finite sum), the lead coefficient is
the coefficient of the highest power of x appearing among the non-zero terms of 
the sum. The distributive law alone informs us that the lead coefficient of a product 
of two non-zero polynomials is in fact the product of the lead coefficients of each 
polynomial, and this is non-zero since D is an integral domain. Thus the product of 
two non-zero polynomials cannot be zero, and so D[x] is an integral domain. 


(Step 2) If X is a finite set, and D is an integral domain, then D[X] is also an 
integral domain. 


We may suppose $X = \{x_1, \ldots, x_n\}$ and proceed by induction on $n$. The case $n = 1$ is Step 1. Suppose the assertion of Step 2 were true for $n - 1$. Then $D' := D[x_1, \ldots, x_{n-1}]$ is an integral domain. Then $D[X] = D'[x_n]$ is an integral domain by applying Step 1 with $D'$ in the role of $D$.


(Step 3) If $X$ is an infinite set and $D$ is an integral domain, then $D[X]$ is an integral domain.

Consider any two non-zero polynomials $p$ and $q$ in $D[X]$. Each is a finite $D$-linear combination of monomials and so both $p$ and $q$ lie in a subring $D[X']$ where $X'$ is a finite subset of $X$. But $D[X']$ is itself an integral domain by Step 2, and so the product $pq$ cannot be zero.

The proof is complete.

Theorem 7.3.2 (Degrees in polynomial domains) If $D$ is an integral domain, then $D[X]$ possesses the following properties:

1. Let $D_i$ denote the set of homogeneous polynomials of degree $i$. (Note that this set includes the zero polynomial and is closed under addition.) The degree of the product of two non-zero homogeneous polynomials is the sum of their degrees. In general, the domain $D[X]$ possesses a direct decomposition as additive groups (actually $D$-modules)

$D[X] = D_0 \oplus D_1 \oplus D_2 \oplus \cdots \oplus D_n \oplus \cdots$

where $D_i D_j \subseteq D_{i+j}$.

2. For arbitrary non-zero polynomials $p$ and $q$ (whether homogeneous or not) the degree of their product is the sum of their degrees. The group of units consists of the units of $D$ (embedded as polynomials of degree zero in $D[X]$).

3. An element of $D[X]$ is said to be irreducible if and only if it is a non-unit that cannot be expressed as a product of two non-units. (Notice that this definition does not permit an irreducible element to be a unit.) If $D = F$ is a field, then every non-zero non-unit of $F[X]$ is a product of finitely many irreducible polynomials.¹²


Proof Part 1. Let $h_1$ and $h_2$ be non-zero homogeneous polynomials of degrees $d_1$ and $d_2$, respectively. Then all formal monomial terms delivered to the product $h_1h_2$ by use of the distributive law possess the same degree $d_1 + d_2$. But can all the coefficients of these monomials be zero? Such an event is impossible since, by Lemma 7.3.1, $D[X]$ is an integral domain. The decomposition of $D[X]$ as a direct sum of the $D_i$ follows from the uniqueness of writing any polynomial as a sum of homogeneous polynomials.

Part 2. The degree of any polynomial is the highest degree of one of its homogeneous summands. That the degrees add for products of homogeneous polynomials now implies the same for arbitrary polynomials. It follows that no polynomial of positive degree can be a unit of $D[X]$, and so all units are found in $D \cong D_0$, and are thus units of $D$ itself.

Part 3. Let $D = F$ be a field. Suppose a non-zero non-unit $p \in F[X]$ were not a finite product of irreducible elements. Then $p$ is itself not irreducible, and so can be written as a product $p_1p_2$ of two non-units. Suppose one of these factors, say $p_1$, had degree zero. Then $p_1$ would be a non-zero element of $F$. But in that case, $p_1$ would be a unit of $F[X]$, contrary to its being a non-unit. Thus $p_1$ and $p_2$ have positive degrees, and by Part 2, these degrees must be less than the degree of $p$. An easy induction on degrees reveals that both of the $p_i$ are finite products of irreducible elements, and therefore $p = p_1p_2$ is also such a product.


Clearly the polynomial ring $R[x]$ studied in an earlier part of this section is just the special case of $R[X]$ where $X = \{x\}$. For $R[X]$ there is also a version of the evaluation homomorphism $E_\alpha$ of $R[x]$. The reader is asked to recall the definition of $R$-homomorphism which was introduced on p. 193.


¹² A much stronger statement about factorization in $D[X]$ is investigated in Chap. 10.

Theorem 7.3.3 (The Universal Property of Polynomial Rings) Suppose $B$ is a commutative ring containing $R$ as a subring. Let $X = \{x_1, \ldots, x_n\}$, and let $\phi : X \to B$ be any function. Then there exists an $R$-homomorphism

$\hat\phi : R[X] \to B$

which extends $\phi$.


Proof For each monomial $m = \prod_{i=1}^n x_i^{a_i}$ set $\hat\phi(m) = \prod_{i=1}^n \phi(x_i)^{a_i}$, which is a unique element of $B$. For each polynomial $p = \sum a_m m$, where $m$ is a monomial in $M^*(X)$, we set

$\hat\phi(p) := \sum a_m \hat\phi(m)$. (7.12)

In other words, to effect $\hat\phi$, one merely substitutes $\phi(x_i)$ for $x_i$, leaving all coefficients the same. Note that when $m$ is the unique monomial in the $x_i$ of degree zero (i.e. all the exponents $a_i$ are zero) then $m = 1$, the multiplicative identity element of $R[X]$. Then by definition

$\hat\phi(1) = \hat\phi(x_1^0 x_2^0 \cdots x_n^0) = \prod_{i=1}^n \phi(x_i)^0 = 1_B$,

the multiplicative identity element of the ring $B$.
For all polynomials $p, p_1, p_2 \in R[X]$, and ring elements $r \in R$, one has

$\hat\phi(rp) = r\hat\phi(p)$
$\hat\phi(p_1 + p_2) = \hat\phi(p_1) + \hat\phi(p_2)$
$\hat\phi(p_1p_2) = \hat\phi(p_1)\hat\phi(p_2)$.

The first equation follows from the definition of $\hat\phi$ given in Eq. (7.12). The next two equations are natural consequences of $\hat\phi$ being a substitution transformation.

Since $B$ is commutative, it follows that $\hat\phi : R[X] \to B$ is an $R$-homomorphism as defined on p. 193.
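The substitution map of Theorem 7.3.3 can be checked numerically on a sample. The sketch below is illustrative only and rests on stated assumptions: $R = B = \mathbb{Z}$, two indeterminates, polynomials stored as dictionaries from exponent tuples to coefficients, and hypothetical helper names `poly_add`, `poly_mul` and `substitute`.

```python
from collections import defaultdict
from math import prod

def poly_add(p, q):
    out = defaultdict(int)
    for poly in (p, q):
        for m, c in poly.items():
            out[m] += c
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    out = defaultdict(int)
    for m1, a in p.items():
        for m2, b in q.items():
            # Multiplying monomials adds exponents coordinate by coordinate.
            m = tuple(e1 + e2 for e1, e2 in zip(m1, m2))
            out[m] += a * b
    return {m: c for m, c in out.items() if c != 0}

def substitute(p, values):
    """Replace each x_i by values[i], leaving the coefficients unchanged."""
    return sum(c * prod(v ** e for v, e in zip(values, m))
               for m, c in p.items())

p = {(2, 0): 1, (0, 1): 3}        # x^2 + 3y
q = {(1, 1): 2, (0, 0): -1}       # 2xy - 1
values = (2, -5)                  # hypothetical choice x -> 2, y -> -5

print(substitute(p, values), substitute(q, values))
print(substitute(poly_mul(p, q), values))
print(substitute(poly_add(p, q), values))
```

On this sample, substituting into the product gives the product of the substituted values, and likewise for the sum, matching the three displayed identities in the proof.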


7.3.6 Algebraic Varieties: An Application of the Polynomial
Rings F[X]


This brief section is a side issue. Its purpose is to illustrate the interplay between polynomials in many commuting indeterminates and other parts of Algebra. The evaluation homomorphism plays a key role in this interplay.

Let $n$ be a positive integer and let $V$ be the familiar $n$-dimensional vector space $F^n$ of all $n$-tuples over a field $F$. Since the field $F$ is commutative, the ring $F[X] := F[x_1, x_2, \ldots, x_n]$ of all polynomials in $n$ (commuting) indeterminates $\{x_1, \ldots, x_n\}$ having coefficients in the field $F$ is a commutative ring.

Recall from Theorem 7.3.3 that if $f$ is a mapping taking $x_i$ to $a_i \in F$, $i = 1, \ldots, n$, we obtain an $F$-homomorphism

$\hat f : F[X] \to F$.

Since $\hat f$ is defined by substituting $a_i$ for $x_i$, it makes sense to denote the image of polynomial $p$ under this mapping by the symbol $p(a_1, \ldots, a_n)$.

Notice that for every vector $v = (a_1, \ldots, a_n) \in V$, there exists a unique mapping $\alpha(v) : X \to F$ taking $x_i$ to $a_i$ and so, by Theorem 7.3.3, there exists an $F$-homomorphism of rings $\hat\alpha(v) : F[X] \to F$. Thus each polynomial $p$ defines a polynomial function $e(p) : V \to F$ taking $v$ to $\hat\alpha(v)(p)$; that is, it maps vector $v = (a_1, \ldots, a_n)$ to the field element $p(a_1, \ldots, a_n)$. (Harking back to our discussion in one variable, $p$ is a polynomial, not a function, but with the aid of the evaluation mappings, it now determines a polynomial function $e(p) : V \to F$.)

The vectors $v$ such that $e(p)(v) = 0 \in F$ are called the zeroes of polynomial $p$. A collection of vectors in $V$ consisting of the common zeroes of some set of polynomials is called an affine variety.

Suppose we begin with an ideal $J$ in $F[X]$. Then the variety of zeroes of $J$ is defined to be the subset

$\mathcal{V}(J) := \{(a_1, \ldots, a_n) \in V \mid p(a_1, \ldots, a_n) = 0 \text{ for all } p \in J\}$.

In other words, $\mathcal{V}(J)$ is the set of vectors which are zeroes of each polynomial $p(x_1, \ldots, x_n)$ in $J$. Clearly, if we have ideals $I \subseteq J \subseteq F[X]$, then $\mathcal{V}(I) \supseteq \mathcal{V}(J)$, and so we have an order-reversing mapping

$\mathcal{V}$ : the poset of ideals of $F[X]$ $\to$ the poset of subsets of $V$.

Conversely, for every subset $X \subseteq V$, we may consider the collection

$\mathcal{I}(X) := \{p(x_1, \ldots, x_n) \in F[X] \mid p(a_1, \ldots, a_n) = 0 \text{ for all } (a_1, \ldots, a_n) \in X\}$.

Clearly $\mathcal{I}(X)$ is an ideal of $F[X]$ for any subset $X \subseteq V$, and if $Y \subseteq X \subseteq V$, then $\mathcal{I}(Y) \supseteq \mathcal{I}(X)$. Therefore we have an order-reversing mapping of these posets in the opposite direction:

$\mathcal{I}$ : poset of subsets of $V$ $\to$ poset of ideals of $F[X]$.

Moreover, one has the monotone relations

$\bar J := \mathcal{I}(\mathcal{V}(J)) \supseteq J$
$\bar X := \mathcal{V}(\mathcal{I}(X)) \supseteq X$

for every ideal $J \subseteq F[X]$ and every subset $X \subseteq V$. Thus the pair of mappings forms a Galois connection in the sense of Chap. 2, and the two “bar” operations, ideal $J \mapsto \bar J$ and subset $X \mapsto \bar X$, are closure operators (idempotent order preserving maps on a poset). Their images (the “closed” objects) are a special subposet which defines a topology on the respective ambient sets. If we start with an ideal $J$, its closure $\bar J$ turns out to be the ideal

$\sqrt{J} := \{x \in F[X] \mid x^n \in J \text{ for some natural number } n\}$.


This is the content of the famous “zeroes Theorem” of David Hilbert, which is valid when the field $F$ is algebraically closed.¹³ An analogue for varieties would be a description of the closure $\bar X$ of $X$ in terms of the original subset $X$. These image sets $\bar X$ are called (affine) algebraic varieties. That there is not a uniform description of affine varieties in terms of $V$ alone should be expected. The world of “$V$” allows only such a primitive language to describe things that we cannot say words like “nilpotent”, “ideal”, etc. That is why the Galois connection is useful. The mystery of these “topology-inducing” closed sets can be pulled back to a better understood algebraic world, the poset of ideals.

Is the new topology on V really new? Of course, it is not ordained in heaven that 
any vector space actually has a topology which is more interesting than the discrete 
one. In the case of finite vector spaces, one cannot expect anything better, and indeed
every subset of a finite vector space is an affine algebraic variety. Also, if F is the field 
of complex numbers, the two topologies coincide, although that takes a non-trivial 
argument. But there are cases where this topology on algebraic varieties differs from 
other standard topologies on V. 

Every time this happens, one has a new opportunity to do analysis another way, 
and to deduce new theorems. 
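The remark that every subset of a finite vector space is an affine algebraic variety can be illustrated with a small computation. The sketch below is not taken from the text; it assumes the prime field with 5 elements, the space of pairs over it, and a standard construction based on Fermat's little theorem: for a point $a$, the polynomial $\prod_i (1 - (x_i - a_i)^{p-1})$ takes the value 1 at $a$ and 0 elsewhere, so one minus the sum of these over a subset vanishes exactly on that subset. All names are hypothetical.

```python
from itertools import product

P = 5   # a prime; arithmetic is carried out modulo P
N = 2   # dimension of the vector space

def delta(a, v):
    """Value at v of prod_i (1 - (x_i - a_i)^(P-1)): 1 if v == a, else 0."""
    out = 1
    for ai, vi in zip(a, v):
        out = out * (1 - pow(vi - ai, P - 1, P)) % P
    return out

def vanishing_poly_value(X, v):
    """Value at v of the polynomial 1 - sum over a in X of delta_a."""
    return (1 - sum(delta(a, v) for a in X)) % P

X = {(0, 0), (1, 3), (4, 2)}                     # an arbitrary subset
V = list(product(range(P), repeat=N))            # all 25 vectors
zeros = {v for v in V if vanishing_poly_value(X, v) == 0}
print(zeros == X)
```

The zero set of this single polynomial function is exactly the chosen subset, so the subset is the variety of the ideal that the polynomial generates.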


7.4 Other Examples and Constructions of Rings 


7.4.1 Examples 


¹³ Note that $\sqrt{J}$, as defined, is not necessarily an ideal when $F$ is replaced by a non-commutative division ring.

Example 37 CONSTRUCTIONS OF FIELDS. Perhaps the three most familiar fields are the rational numbers $\mathbb{Q}$, the real numbers $\mathbb{R}$, and the complex numbers $\mathbb{C}$. In addition we have the field of integers modulo a prime. Later, when we begin the Galois theory and the theory of fields, we shall meet the following ways to construct a field:


1. Forming the factor ring D/M where M is a maximal ideal in an integral domain 
D. 

2. Forming the ring of quotients of an integral domain. 

3. Forming the topological completion of a field. 


The first of these methods is already validated in Theorem 7.2.5. However, making 
use of this theorem requires that we have sufficient machinery to determine that 
a given ideal is maximal. Using the Division Algorithm and the Euclidean Trick 
(Lemmas 1.1.1 and 1.1.2, respectively), we can flesh this out in the ring Z of integers, 
as follows. 

First of all, let $I$ be an ideal in $\mathbb{Z}$. We shall show that $I$ is a principal ideal, that is, $I = \mathbb{Z}n := \{mn \mid m \in \mathbb{Z}\}$ for some integer $n$. If $I = 0$, then it is already clear that $I = \mathbb{Z}0$. Thus, we assume that $I \neq 0$. Since $x \in I$ implies that $-x \in I$, we may conclude that $I \cap \mathbb{Z}^+ \neq \emptyset$.

Thus, we let $n$ be the least positive integer in $I$; we claim that $I = \mathbb{Z}n$. Suppose $x \in I$. By the Division Algorithm there are integers $q$ and $r$ with $x = qn + r$, $0 \le r < n$. Since $r = x - qn \in I$, we infer that $r = 0$, by our minimal choice of $n$. Thus $x = qn$ is a multiple of $n$. Since $x$ was an arbitrary member of $I$, one has $I = \mathbb{Z}n$.

Next, we shall show that the ideal $I = \mathbb{Z}p$ is a maximal ideal if and only if $p$ is prime. To this end, let $p$ be prime and let $J$ be an ideal of $\mathbb{Z}$ containing $\mathbb{Z}p$. If $\mathbb{Z}p \subsetneq J$, then there exists an element $x$ in the set $J - \mathbb{Z}p$. Thus, we see that $p \nmid x$. We can then apply the Euclidean Trick to obtain integers $s$ and $t$ with $sp + tx = 1$. But as $p, x \in J$, we find that $1 \in J$, forcing $J = \mathbb{Z}$. Therefore, when $p$ is prime, the ideal $\mathbb{Z}p$ is maximal. The converse is even easier; we leave the details to the reader.
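The integers $s$ and $t$ with $sp + tx = 1$ used in the maximality argument can be computed explicitly. The following is a minimal sketch of the extended Euclidean algorithm, offered as one common way to realize the Euclidean Trick; the function name and the sample values $p = 13$, $x = 8$ are illustrative assumptions.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants old_r == old_s*a + old_t*b
        # and r == s*a + t*b.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

p, x = 13, 8                      # p prime, x not a multiple of p
g, s, t = extended_gcd(p, x)
print(g, s * p + t * x)           # both equal 1 here

# Reading s*p + t*x = 1 modulo p shows that t inverts x in Z/Zp.
inverse_of_x_mod_p = t % p
print((x * inverse_of_x_mod_p) % p)
```

The same identity is what makes every non-zero coset of $\mathbb{Z}/\mathbb{Z}p$ invertible, which is the field property of the factor ring.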

The field $\mathbb{C}$ of complex numbers is also an example of this sort. The integral domain $\mathbb{R}[x]$ also possesses an analogue of the Euclidean algorithm utilizing the degree of a polynomial to compare the remainder $r$ with $b$ in the standard equation $a = qb + r$ produced by the algorithm. As a result, an argument similar to that of the previous paragraph employing the Euclidean trick shows that the principal ideal $\mathbb{R}[x]p(x)$ of all multiples of the polynomial $p(x)$ is a maximal ideal whenever $p(x)$ is a polynomial that is prime in the sense that it cannot be factored into two non-units in $\mathbb{R}[x]$; that is, $p(x)$ is an irreducible polynomial in $\mathbb{R}[x]$. Now the polynomial $x^2 + 1$ cannot factor into two polynomials of $\mathbb{R}[x]$ of degree one, so it is irreducible and the ideal $((x^2 + 1)) := \mathbb{R}[x](x^2 + 1)$ is therefore maximal. The factor ring $F = \mathbb{R}[x]/((x^2 + 1))$ is therefore a field. Each element of the field is a coset with the unique form $a + bx + ((x^2 + 1))$, $a, b \in \mathbb{R}$. Setting $i := x + ((x^2 + 1))$ and $1 := 1 + ((x^2 + 1))$ we see that


(i) $1$ is the multiplicative identity of $F$,
(ii) $i^2 = -1$, and that
(iii) every element of the field $F$ has a unique expression as $a1 + bi$ for real numbers $a$ and $b$.

One now sees that $F$ conforms to the usual presentation of the field $\mathbb{C}$ of complex numbers.
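The arithmetic of the factor ring can be made concrete. In the sketch below, offered as an illustration under stated assumptions, the coset of $a + bx$ is stored as the pair `(a, b)`, and multiplication reduces $x^2$ to $-1$; the function name is hypothetical, and integer coefficients are used so that the comparison with Python's built-in complex numbers is exact.

```python
def mul_mod_x2_plus_1(u, v):
    """Multiply the cosets of a + b*x and c + d*x modulo x^2 + 1."""
    (a, b), (c, d) = u, v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 = -1 in the quotient.
    return (a * c - b * d, a * d + b * c)

i = (0, 1)                         # the coset of x
print(mul_mod_x2_plus_1(i, i))     # the coset of -1

u, v = (2, 3), (-1, 4)
product_in_quotient = mul_mod_x2_plus_1(u, v)
z = complex(*u) * complex(*v)      # the same product in Python's complex type
print(product_in_quotient, z)
```

Both computations return the same pair of real and imaginary parts, which is the sense in which $F$ conforms to the usual presentation of the complex numbers.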

The field $\mathbb{Q}$ on the other hand is the ring of quotients of the ring of integers $\mathbb{Z}$, while $\mathbb{R}$ is the topological completion of the order topology on $\mathbb{Q}$ (its elements are the familiar Dedekind cuts). Thus each of the most commonly used fields of mathematics, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$, exemplifies one of the three methods of forming a field listed at the beginning of this example.


Example 38 RINGS OF FUNCTIONS. Let $X$ be a set and let $R$ be a ring. Set $R^X := \{\text{functions } X \to R\}$. For $f, g \in R^X$, we define their point-wise sum and product by setting

$(f + g)(x) := f(x) + g(x)$, $(fg)(x) := f(x)g(x)$,

where the sum and product on the right-hand sides of the equations above are computed in the ring $R$. The additive identity is the function mapping $X$ identically to $0 \in R$; the multiplicative identity maps $X$ identically to $1 \in R$. One easily checks the remaining ring properties in $R^X$.


Example 39 DIRECT PRODUCT OF RINGS. Suppose $\{R_\sigma\}_{\sigma \in I}$ is a family of rings indexed by $I$. Let $1_\sigma$ be the multiplicative identity element of $R_\sigma$. Now we can form the direct product of the additive groups $(R_\sigma, +)$, $\sigma \in I$:

$P := \prod_{\sigma \in I} (R_\sigma, +)$.

Recall that the elements of the product are functions $f : I \to \bigcup_{\sigma \in I} R_\sigma$ (disjoint union), with the property that $f(\sigma) \in R_\sigma$ for each $\sigma \in I$. We can then convert this additive group into a ring by defining the multiplication of two elements $f$ and $g$ of $P$ by the rule that $fg$ is the function $I \to \bigcup_{\sigma \in I} R_\sigma$ defined by

$(fg)(\sigma) := f(\sigma)g(\sigma)$, $\sigma \in I$,

where the juxtaposition on the right indicates multiplication in the ring $R_\sigma$. (This sort of multiplication is often called “coordinate-wise multiplication.” Note also that this example generalizes Example 38 above.) It is an easy exercise to show that this multiplication is associative and is both left and right distributive with respect to addition in $P$. Finally, one notes that the function $1_P$ whose value at $\sigma$ is $1_\sigma$, the multiplicative identity of the ring $R_\sigma$, $\sigma \in I$, is the multiplicative identity in $P$; so $P$ is indeed a ring.

Now that we understand a direct product of rings, is there a corresponding direct sum of rings? The additive group $(S, +)$ of such a ring should be a direct sum $\bigoplus_{\sigma \in I} (R_\sigma, +)$ of additive groups of rings. But in that case, the multiplicative identity element $1_S$ of the ring $S$ must be uniquely expressible as a function which vanishes on all but a finite subset $F$ of indices in $I$. But if multiplication is the “coordinate-wise multiplication” of its ambient direct product, we have a problem when $I$ is infinite. For in that case, there is an indexing element $\sigma \in I - F$, and there exists a function $p_\sigma$ which vanishes at all indices distinct from $\sigma$, but assumes the value $1_\sigma$ at $\sigma$. Then we should have $1_S \cdot p_\sigma = 0 \neq p_\sigma$, contradicting the assumption that $1_S$ was the identity element. Since our definition of ring requires the existence of a multiplicative identity we see that any infinite direct sum of rings,

$\bigoplus_{I} R_\sigma = \bigoplus_{\sigma \in I} R_\sigma$,

although possessing most of the properties of a ring, is not actually a ring itself. This is an expense one pays in maintaining a multiplicative identity.
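The obstruction to an identity element in an infinite direct sum can be seen in a toy model. The sketch below assumes, purely for illustration, that every ring in the family is $\mathbb{Z}$ and that the index set is the natural numbers; a finitely supported function is stored as a dictionary of its non-zero values. The names are hypothetical.

```python
def coordinatewise_product(f, g):
    """Coordinate-wise product of two finitely supported functions."""
    return {s: f[s] * g[s] for s in f.keys() & g.keys() if f[s] * g[s] != 0}

def witness_against_identity(e):
    """Given a candidate identity e, find p_sigma with e * p_sigma != p_sigma."""
    sigma = max(e, default=-1) + 1      # an index outside the support of e
    p_sigma = {sigma: 1}                # value 1 at sigma, zero elsewhere
    return sigma, p_sigma, coordinatewise_product(e, p_sigma)

candidate = {0: 1, 1: 1, 2: 1}          # any finitely supported candidate
sigma, p_sigma, result = witness_against_identity(candidate)
print(sigma, p_sigma, result)           # the product is the zero function
```

Whatever finite support the candidate has, an index outside it yields a function that the candidate annihilates rather than fixes.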


Example 40 OPPOSITE RINGS. Suppose we have a ring $R$ with a given multiplication table. We can define a new multiplication (let’s call it “$\circ$”) by transposing this multiplication table. That is, $a \circ b := ba$ for all $a$ and $b$ in $R$. The new operation is still distributive with respect to addition with both distributive laws simply transposed, and it is easy for the student to verify that $(R, +, \circ)$ is a ring. (Remember, we need a multiplicative identity, and the associative law should at least be checked.) We call this ring the opposite ring of $R$ and denote it by the symbol $\mathrm{Opp}(R)$.

One may now observe that any antiautomorphism of $R$ is essentially an isomorphism $R \to \mathrm{Opp}(R)$.


Example 41 ENDOMORPHISM RINGS. Let $(A, +)$ be an additive abelian group and let $\mathrm{End}(A) := \mathrm{Hom}(A, A)$ denote the set of all group homomorphisms $A \to A$. We give $\mathrm{End}(A)$ the structure of a ring as follows. First of all, addition is defined point-wise, i.e., $(f + g)(a) := f(a) + g(a)$, $f, g \in \mathrm{End}(A)$, $a \in A$. The reader should have no difficulty in showing that $\mathrm{End}(A)$ is an abelian group relative to the above addition, with identity being the mapping that sends every element of $A$ to the additive identity $0 \in A$.

The multiplication in $\mathrm{End}(A)$ is just function composition $fg := f \circ g : A \to A$, which is always associative. The identity function $1_A$ which takes every element $a \in A$ to itself is clearly the multiplicative identity. Next, note that if $f, g, h \in \mathrm{End}(A)$, and if $a \in A$, then

$f(g + h)(a) = f(g(a) + h(a)) = fg(a) + fh(a) = (fg + fh)(a)$.

Similarly, one has the right distributive law, and so it follows that $\mathrm{End}(A)$ is a ring relative to the above-defined addition and multiplication.

In case $A$ is a right vector space over the field $F$, we may not only consider $\mathrm{End}(A)$ as above (emphasizing only the structure of $A$ as an additive abelian group), but we may also consider the set

$\mathrm{End}_F(A) := \{F\text{-linear transformations } f : A \to A\}$.

Since the point-wise sum of $F$-linear transformations is again an $F$-linear transformation, as is the composition of two $F$-linear transformations, we conclude that $\mathrm{End}_F(A)$ is, in fact, a subring of $\mathrm{End}(A)$.

When the $F$-vector space $A$ has finite dimension $n$, then by choosing a basis, every linear transformation of $A$ into itself can be rendered uniquely as an $n \times n$ matrix with coefficients in $F$. Then addition and composition of linear transformations are duplicated as matrix addition and multiplication. In this guise, $\mathrm{End}_F(A)$ is seen to be isomorphic to the full matrix ring over $F$, that is, the ring $M_n(F)$ of all $n \times n$ matrices over $F$, with respect to matrix addition and multiplication.

Now suppose $\sigma \in \mathrm{Aut}(F)$, the group of automorphisms of $F$. Then $\sigma$ determines a bijection $m(\sigma) : M_n(F) \to M_n(F)$ which takes each matrix $A = (a_{ij})$ to $A^{m(\sigma)} := ((a_{ij})^\sigma)$. Recall that for matrices $A = (a_{ij})$ and $B = (b_{ij})$, the $(i, j)$-entry of $AB$ is the sum $\sum_k a_{ik}b_{kj}$. Observing that applying the field automorphism $\sigma$ to this sum is the same as applying it to each $a_{ik}$ and $b_{kj}$, one sees that

$(AB)^{m(\sigma)} = A^{m(\sigma)} B^{m(\sigma)}$.

Similarly $m(\sigma)$ preserves sums of matrices and so $m(\sigma)$ is an automorphism of the matrix ring $M_n(F)$.

Next define the transpose mapping $T : M_n(F) \to M_n(F)$ which takes each matrix $A = (a_{ij})$ to its transpose $A^T := (a_{ji})$; in other words, the entries of the matrix are reflected across its main diagonal. Now if $A = (a_{ij})$ and $B = (b_{ij})$ are matrices of $M_n(F)$ it is straightforward to check that both $(AB)^T$ and $B^TA^T$ possess the same $(i, j)$th entry, namely $\sum_k a_{jk}b_{ki}$ (see Example 35, p. 190). Since it is clear from the definition of $T$ that it preserves addition, one now sees that the transpose mapping is an antiautomorphism of the ring $M_n(F)$.

We finally ask the reader to observe that the antiautomorphism $T$ and the automorphism $m(\sigma)$ are commuting mappings on $M_n(F)$. Thus (from the data given at the beginning of Sect. 7.2) the composition of these two mappings in any order is one and the same antiautomorphism of the ring $M_n(F)$.

The ring $M_n(F)$ is a simple ring for every positive integer $n$ and field $F$. We will
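These three facts about $m(\sigma)$ and $T$ can be spot-checked numerically. The sketch below assumes $F = \mathbb{C}$ with $\sigma$ taken to be complex conjugation and $n = 2$; the sample matrices have small integer real and imaginary parts so that floating-point arithmetic is exact. The helper names are hypothetical, and a check on one pair of matrices is an illustration rather than a proof.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    """The mapping T: reflect the entries across the main diagonal."""
    return [list(row) for row in zip(*A)]

def conj(A):
    """The mapping m(sigma) for sigma = complex conjugation, entry by entry."""
    return [[z.conjugate() for z in row] for row in A]

A = [[1 + 2j, 3 + 0j], [0j, 1j]]
B = [[2 - 1j, 1j], [4 + 0j, -3 + 5j]]

# m(sigma) preserves products, T reverses them, and the two maps commute.
print(conj(mat_mul(A, B)) == mat_mul(conj(A), conj(B)))
print(transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A)))
print(transpose(conj(A)) == conj(transpose(A)))
```

The composite of the two maps, conjugate transpose, is the antiautomorphism that reappears in the quaternion example.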
not prove this until much later; it is only offered here as a “sociological” statement 
about the population of mathematical objects which are known to exist. It would be 
sadistic to propose definitions to students which turn out to be inhabited by nothing— 
providing yet one more description of the empty set. So there is some value in pointing 
out (without proof) that certain things exist. 


Example 42 QUATERNIONS. As in the previous Example 37, we consider the field $\mathbb{C}$ of complex numbers with each complex number uniquely represented in the form $a + bi$, $a, b \in \mathbb{R}$, where $i^2 = -1 \in \mathbb{R}$. Complex conjugation is the bijection $\mathbb{C} \to \mathbb{C}$ which maps each complex number $\alpha = a + bi$, $a, b \in \mathbb{R}$, to its complex conjugate $\bar\alpha = a - bi$. One can easily check that for any complex numbers $\alpha$ and $\beta$


$\overline{\alpha + \beta} = \bar\alpha + \bar\beta$ (7.13)
$\overline{\alpha\beta} = \bar\alpha\bar\beta$ (7.14)

so that complex conjugation is an automorphism of the field $\mathbb{C}$.

It is useful to define the “norm” $N(\alpha)$ of a complex number as the real number $\alpha\bar\alpha$. Thus, if $\alpha = a + bi$, then $N(\alpha) = a^2 + b^2$, which is always a non-negative real number, and is zero only if $a = b = 0$, that is, only if $\alpha = 0$. Moreover, since complex conjugation is an automorphism of the complex number field, we have

$N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in \mathbb{C}$.


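Next let $H$ be the collection of all 2-by-2 matrices of the form

$h(\alpha, \beta) := \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$,

where $\alpha$ and $\beta$ are complex numbers. Clearly by Eq. (7.13), $H$ is an additive subgroup of the full group $M_2(\mathbb{C})$ of all 2-by-2 matrices with entries in $\mathbb{C}$ under matrix addition. But $H$ is actually a subring of the matrix ring $M_2(\mathbb{C})$, since

$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \begin{pmatrix} \gamma & \delta \\ -\bar\delta & \bar\gamma \end{pmatrix} = \begin{pmatrix} \alpha\gamma - \beta\bar\delta & \alpha\delta + \beta\bar\gamma \\ -\bar\beta\gamma - \bar\alpha\bar\delta & -\bar\beta\delta + \bar\alpha\bar\gamma \end{pmatrix}$

by ordinary matrix multiplication. Thus, using Eq. (7.14), we have

$h(\alpha, \beta)h(\gamma, \delta) = h(\alpha\gamma - \beta\bar\delta,\ \alpha\delta + \beta\bar\gamma)$. (7.15)

Clearly the 2-by-2 identity matrix $I_2$ is the multiplicative identity of the ring $H$. Notice that the complex-transpose mapping defined by

$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \mapsto \begin{pmatrix} \bar\alpha & -\beta \\ \bar\beta & \alpha \end{pmatrix}$

induces an antiautomorphism $\tau : H \to H$ (see Example 35, p. 190). Thus $h(\alpha, \beta)^\tau = h(\bar\alpha, -\beta)$.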

Next one may notice that

$D(h(\alpha, \beta)) := h(\alpha, \beta)h(\alpha, \beta)^\tau = (N(\alpha) + N(\beta))I_2$, (7.16)

the sum of the norms of two complex numbers times the 2 × 2 identity matrix. Since such norms are non-negative real numbers, the sum can be zero only if each summand norm is zero, and this forces $\alpha = \beta = 0 \in \mathbb{C}$ so that $h(\alpha, \beta)$ is the 2-by-2 zero matrix, the zero element of the ring $H$. Thus if $h(\alpha, \beta) \neq 0 \in H$, it is an invertible matrix of $M_2(\mathbb{C})$, whose inverse is given by

$h(\alpha, \beta)^{-1} = h(d^{-1}\bar\alpha,\ d^{-1}(-\beta))$, where $d = N(\alpha) + N(\beta)$,

and this inverse clearly belongs to $H$. Thus we see that every non-zero element of $H$ is a unit in $H$. Thus $H$ is a division ring. The reader may check that it is not a commutative ring, and so the ring $H$, called “the division ring of real quaternions”, is our first example of a division ring which is not a field. (The student is encouraged to peruse Exercise (7) in Sect. 7.5.4.)
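The matrix model of the quaternions lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions: the function `h` builds the matrix with first row $(\alpha, \beta)$ and second row $(-\bar\beta, \bar\alpha)$, the sample entries have small integer parts so that floating-point arithmetic is exact, and the names `qi` and `qj` for two particular elements are hypothetical.

```python
def h(alpha, beta):
    """The 2-by-2 complex matrix with rows (alpha, beta), (-conj(beta), conj(alpha))."""
    return ((alpha, beta), (-beta.conjugate(), alpha.conjugate()))

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def norm(z):
    return (z * z.conjugate()).real

a, b, c, d = 1 + 2j, 3 - 1j, -2 + 1j, 2 + 3j

# The product formula: h(a,b)h(c,d) = h(ac - b*conj(d), ad + b*conj(c)).
product_ok = mat_mul(h(a, b), h(c, d)) == h(a * c - b * d.conjugate(),
                                            a * d + b * c.conjugate())

# h(a,b) times its complex transpose h(conj(a), -b) is (N(a) + N(b)) * I_2.
n = norm(a) + norm(b)
norm_ok = mat_mul(h(a, b), h(a.conjugate(), -b)) == ((n, 0), (0, n))

# Two elements that fail to commute.
qi, qj = h(1j, 0j), h(0j, 1 + 0j)
commute = mat_mul(qi, qj) == mat_mul(qj, qi)
print(product_ok, norm_ok, commute)
```

The first two checks succeed and the third reports that `qi` and `qj` do not commute, consistent with $H$ being a division ring that is not a field.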


Example 43 THE DIRICHLET ALGEBRA AND MÖBIUS INVERSION. This example comes from Number Theory. Let $P = (\mathbb{Z}^+, |)$ denote the divisor poset on the set of positive integers. Thus one writes $a|b$ if integer $a$ divides integer $b$, $a, b \in \mathbb{Z}^+$.

Let $A := \mathrm{Hom}(\mathbb{Z}^+, \mathbb{C})$ denote the set of all functions from the positive integers into the complex numbers, viewed as a vector space over $\mathbb{C}$. A binary operation “$*$” is defined on $A$ in the following way: if $f$ and $g$ are functions, then $f * g$ is defined by the formula

$(f * g)(n) := \sum_{d_1, d_2;\ d_1d_2 = n} f(d_1)g(d_2)$.


It is easily seen that “$*$” is distributive with respect to vector addition and that

$[(f * g) * h](n) = \sum_{d_1, d_2, d_3;\ d_1d_2d_3 = n} f(d_1)g(d_2)h(d_3) = [f * (g * h)](n)$,

so the associative law holds. Define the function $\epsilon$ by

$\epsilon(1) = 1$ and $\epsilon(n) = 0$, if $n > 1$.

Then $\epsilon * f = f * \epsilon = f$ for all $f \in A$. Thus $A = (A, +, *)$ is a commutative ring (actually a $\mathbb{C}$-algebra) whose multiplication $*$ is called “Dirichlet multiplication” (see [23], p. 23).

The zeta function is the constant function $\zeta$ such that

$\zeta(n) = 1$, for all $n \in \mathbb{Z}^+$.


The zeta function is a unit in the ring $A$, and its inverse, $\mu$, is called the Möbius function. The definition of the Möbius function depends on the fact that every positive integer $n$ is uniquely expressible (up to the order of the factors) as a product of positive prime numbers, $n = p_1^{a_1} \cdots p_r^{a_r}$.¹⁴ (In the case $n = 1$ all exponents $a_i$ are zero and the product is empty.) Then $\mu(n)$ is given by:

$\mu(n) = 1$, if $n = 1$,
$\mu(n) = 0$, if some $a_i \ge 2$,
$\mu(n) = (-1)^r$, when each $a_i = 1$.

¹⁴ A consequence of the Jordan-Hölder Theorem.

Lemma 7.4.1 If $n > 1$,

$\sum_{d|n} \mu(d) = 0$,

where the sum is over all positive integer divisors of $n$.

Proof Suppose $n = p_1^{a_1} \cdots p_r^{a_r}$, where the $p_i$ are positive primes. Then

$\sum_{d|n} \mu(d) = \sum_{(\epsilon_1, \ldots, \epsilon_r) \in \{0,1\}^r} \mu(p_1^{\epsilon_1} \cdots p_r^{\epsilon_r})$
$= 1 - r + \binom{r}{2} - \cdots + (-1)^r$
$= (1 - 1)^r = 0$.

Lemma 7.4.2 $\mu * \zeta = \zeta * \mu = \epsilon$.

Proof By the definition of Dirichlet multiplication, $(\mu * \zeta)(1) = \zeta(1) \cdot \mu(1) = 1$. If $n > 1$,

$(\mu * \zeta)(n) = \sum_{d|n} \mu(d) = 0$

by Lemma 7.4.1.


Theorem 7.4.3 (MÖBIUS INVERSION) If the functions $f$ and $g$ are related by the equation

$g(n) = \sum_{d|n} f(d)$ for all $n$, (7.17)

then

$f(n) = \sum_{d|n} g(d)\mu(n/d) = \sum_{d|n} \mu(d)g(n/d)$ (7.18)

for all positive integers $n$.

Proof Equation (7.17) asserts that $g = \zeta * f$. Since $\zeta$ is a unit, $f = \zeta^{-1} * g = \mu * g$.
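Dirichlet multiplication and Möbius inversion are easy to experiment with. The sketch below is illustrative: functions on the positive integers are ordinary Python callables, `dirichlet` forms the product by summing over divisor pairs, and `mobius` follows the three-case definition via trial division. The sample function `f` is an arbitrary choice, and checking finitely many values illustrates the theorem without proving it.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def dirichlet(f, g):
    """The Dirichlet product: n -> sum of f(d) * g(n/d) over divisors d of n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def mobius(n):
    """1 at n = 1, 0 if a prime square divides n, else (-1)^(number of primes)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:          # p^2 divided the original argument
                return 0
            result = -result
        p += 1
    if n > 1:                       # one remaining prime factor
        result = -result
    return result

zeta = lambda n: 1
epsilon = lambda n: 1 if n == 1 else 0

# mu * zeta should agree with epsilon.
print(all(dirichlet(mobius, zeta)(n) == epsilon(n) for n in range(1, 60)))

# Inversion: from g = zeta * f, recover f as mu * g.
f = lambda n: n * n + 1             # arbitrary sample function
g = dirichlet(zeta, f)
recovered = dirichlet(mobius, g)
print(all(recovered(n) == f(n) for n in range(1, 60)))
```

Both checks succeed on the tested range, matching Lemma 7.4.2 and Eq. (7.18).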


Remark In the 1930s the group-theorist Philip Hall recognized that the theory of the Dirichlet Algebra and its Möbius function could be generalized to more general partially ordered sets. This idea was fully expounded by Gian-Carlo Rota using an arbitrary locally finite poset to replace the integer divisor poset $P = (\mathbb{Z}^+, |)$ (see [2]). Here the analogue of the Dirichlet Algebra is defined over an arbitrary commutative ring and is called the Incidence Algebra in this context.¹⁵ The zeta function and its inverse, the Möbius function, are still defined for the incidence algebras. Beyond this, Rota’s theory exposed a labyrinth of unexpected identities involving the Möbius function. More on this interesting subject can be harvested from [1] and [2].

¹⁵ The incidence algebra in general is not commutative. The commutativity of the Dirichlet algebra was a consequence of the fact that the poset of divisors of any integer $n$ is self-dual.


7.4.2 Integral Domains 


Let $D$ be any integral domain. Recall from p. 187 that this term asserts that $D$ is a commutative non-zero ring such that if $a, b \in D$ are both non-zero elements, then $ab \neq 0$. In other words, $D^* := D \setminus \{0\}$ is closed under multiplication (and hence, from the other ring axioms, $D^*$ is a multiplicative monoid).


The Cancellation Law 


A very useful consequence of the definition of integral domain is the following: 


Lemma 7.4.4 (The cancellation law) Let D be an integral domain. Suppose for 
elements a, b,c in D, one has ab = ac. If a is non-zero, then b = c. 


Proof Indeed, ab = ac implies that a(b — c) = 0. Since a is non-zero, and non-zero 
elements are closed under multiplication, one must conclude that b — c = 0, forcing 
b=c. 


Domains Which Are Not Like the Integers 


Obviously, the example par excellence of an integral domain is the ring of integers. 
But perhaps that is also why it may be the least representative example. 

The integers certainly enjoy every property that defines an integral domain. 

But there are many further properties of the integers which are not shared by many 
integral domains. For example, we already showed in Example 37 above that every 
ideal J of the ring of integers is a principal ideal. Such an integral domain is called 
a principal ideal domain. Other elementary examples of principal ideal domains 
include 


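$\mathbb{Z}[i] := \{a + bi \in \mathbb{C} \mid a, b \in \mathbb{Z}\}$ (the Gaussian integers);
$\mathbb{Z}[\omega] := \{a + b\omega \in \mathbb{C} \mid a, b \in \mathbb{Z}\}$ (the Eisenstein numbers)
(Here, $\omega = (-1 + \sqrt{-3})/2$, a root of $x^2 + x + 1$.)
$\mathbb{Z}[\sqrt{-2}] := \{a + b\sqrt{-2} \in \mathbb{C} \mid a, b \in \mathbb{Z}\}$;
$F[x]$, where $F$ is any field.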


We shall postpone till Chap. 9 the proofs that the above integral domains are actually 
principal ideal domains. 

On the other hand, we shall give an example of a perfectly respectable integral domain which is not a principal ideal domain. This is the domain

$D := \{a + b\sqrt{-5} \mid a, b \in \mathbb{Z}\}$.


Now consider the ideal $J = 3D + (2 + \sqrt{-5})D$. We shall show that $J$ is not a principal ideal. We start by showing that $J \neq D$. Were $J$ equal to $D$, then we would have $1 \in J$, forcing the existence of elements $x = a + b\sqrt{-5}$, $y = c + d\sqrt{-5} \in D$, $a, b, c, d \in \mathbb{Z}$, with

$1 = 3x + (2 + \sqrt{-5})y = (3a + 2c - 5d) + (3b + 2d + c)\sqrt{-5}$. (7.19)

Thus

$3b + c + 2d = 0$ (7.20)
$3a + 2c - 5d = 1$. (7.21)

Then $c = -3b - 2d$, and substitution for $c$ in Eq. (7.21) yields

$3a - 6b - 9d = 1$.

But this is impossible, as 1 is not an integer multiple of 3. Therefore, $J$ is a proper ideal in $D$.

Now, by way of contradiction, assume I is a principal ideal, so that I = ζD, for
some element ζ ∈ D. Notice that complex conjugation induces an automorphism of
D taking any element z = a + b√−5 with a, b ∈ Z to z̄ := a − b√−5. Then the
norm of z, which is defined by N(z) := z z̄ = a² + 5b², is a mapping N : D → Z
which preserves multiplication. Precisely,

N(yz) = yz · \overline{yz} = yz · ȳ z̄ = (y ȳ)(z z̄) = N(y)N(z).

Thus if I = ζD, then the norm of every element of the ideal I must be a multiple of
N(ζ). Now I = 3D + (2 + √−5)D contains these two elements:


3 − (2 + √−5) = 1 − √−5
6 − (2 + √−5) = 4 − √−5


of norms 6 and 21, respectively. It follows that N(ζ) divides both 6 and 21, so that
ζ = e + f√−5 (e, f ∈ Z) must have norm 1 or 3. If N(ζ) = e² + 5f² = 1, then ζ = ±1.
But that would force I = D, which we have already shown is impossible. Thus,
N(ζ) = 3, which is also impossible, since 3 is not of the form n² + 5m² for any
integers m and n. We conclude that I cannot be a principal ideal.

As a matter of fact we have 9 = 3·3 = (2 + √−5)(2 − √−5), a factorization into
irreducible elements (elements that are not a product of two or more non-units) in
two distinct ways. So the ring D does not have the property of unique factorization
into primes that is enjoyed by the integers.
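The norm computations used in this argument can be checked numerically. The sketch below is illustrative only: it represents an element a + b√−5 of D as the integer pair (a, b), and the names `mul` and `norm` are hypothetical helpers, not notation from the text.

```python
def mul(z, w):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), as integer pairs."""
    a, b = z
    c, d = w
    return (a * c - 5 * b * d, a * d + b * c)

def norm(z):
    """N(a + b*sqrt(-5)) = a^2 + 5 b^2."""
    a, b = z
    return a * a + 5 * b * b

# The norm is multiplicative (checked here on a small grid of elements).
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
multiplicative = all(norm(mul(z, w)) == norm(z) * norm(w)
                     for z in grid for w in grid)

# a^2 + 5 b^2 = 3 forces b = 0 and |a| <= 1, so this small search is exhaustive.
norm_three = [(a, b) for a in range(-2, 3) for b in range(-1, 2)
              if norm((a, b)) == 3]

# Two factorizations of 9 in D.
nine_first = mul((3, 0), (3, 0))
nine_second = mul((2, 1), (2, -1))
print(multiplicative, norm_three, nine_first, nine_second)
```

The empty list `norm_three` reflects the fact that no element of D has norm 3, which is the step that rules out a generator ζ.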


The Characteristic of an Integral Domain 


Suppose now that D is an integral domain. We define the characteristic of D as the 
smallest natural number n such that 


nx = x + x + ··· + x (n terms) = 0

for every element x in D. This already says that o(x) divides n for every x ∈ D,
where o(x) denotes the order of x in the additive abelian group (D, +). Note that if
x ≠ 0 then

x + x + ··· + x (o(1) terms) = x(1 + 1 + ··· + 1) (o(1) terms) = 0,

from which we conclude that o(x) divides o(1). Conversely, from

x(1 + 1 + ··· + 1) (o(x) terms) = x + x + ··· + x (o(x) terms) = 0,

together with the fact that D is an integral domain, we see that (since x ≠ 0)
1 + 1 + ··· + 1 = 0 (o(x) terms), and so it follows also that o(1) | o(x). Thus
o(x) = o(1) for every nonzero x ∈ D.

Note finally that if o(1) is not a prime and is written as o(1) = km for natural
numbers k and m, both smaller than o(1), then 0 = (km)1 = (k1)(m1), forcing k1 = 0
or m1 = 0, contradicting that all nonzero elements of D possess a constant additive
order, as established in the previous paragraph.

Therefore, o(1) must, in fact, be prime. 

We have therefore shown that 


Lemma 7.4.5 For any integral domain D, one of the following two alternatives
holds:


(i) No nonzero element of D has finite additive order, or

(ii) There exists a prime number p such that px = 0 for every element x of D.
Moreover, for every integer n such that 0 < n < p, and every nonzero element
x ∈ D, nx ≠ 0.


In the former case (i) we say that D has characteristic zero, while in the latter 
case (ii) we say that D has characteristic p. 
It should be clear that all subrings of an integral domain of characteristic c (c = 0
or c = p, for some prime p) are integral domains of the same characteristic c. In 
particular, if D is any subring of the complex numbers C, then it is an integral domain 
of characteristic 0. Thus, the Gaussian Integers, the Eisenstein numbers, or any other 
subring of the reals or complex numbers, must be an integral domain of characteristic 
0. The field Z/pZ, where p is a prime number, clearly has positive characteristic p. 
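The statement that all nonzero elements of an integral domain share one additive order can be observed in Z/pZ, and its failure can be observed in a ring with zero divisors. This is a small sketch under the assumption that residue rings Z/nZ are used as test cases; the function name `additive_order` is hypothetical.

```python
def additive_order(x, n):
    """Smallest k >= 1 such that k*x = 0 in Z/nZ."""
    k, total = 1, x % n
    while total != 0:
        total = (total + x) % n
        k += 1
    return k

# In the field Z/7Z every nonzero element has additive order 7.
orders_mod_7 = {additive_order(x, 7) for x in range(1, 7)}
# In Z/6Z (not an integral domain) the nonzero elements have differing orders.
orders_mod_6 = {additive_order(x, 6) for x in range(1, 6)}
print(orders_mod_7, orders_mod_6)
```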

The following lemma shows that integral domains of non-zero characteristic p 
may possess certain endomorphisms. It will be crucial for our analysis of finite fields 
and inseparable extensions in Chap. 11. 


Lemma 7.4.6 Suppose D is an integral domain of positive characteristic p, a prime
number. The pth power mapping D → D, which maps each element r ∈ D to its
pth power r^p, is an endomorphism of D.


Proof Since any integral domain D is commutative, one has (xy)^p = x^p y^p for any
x, y ∈ D; that is, the pth power mapping preserves multiplication. It remains to
show that for any x, y ∈ D,

(x + y)^p = x^p + y^p. (7.22)


Since we are in a commutative ring, the “binomial theorem” (Theorem 7.1.2) holds:

(x + y)^p = \sum_{k=0}^{p} \binom{p}{k} x^k y^{p-k}.

Using the fact that p is a prime number, one can show that each combinatorial
number

\binom{p}{k} = p!/(k!(p − k)!)

is a multiple of p whenever 0 < k < p.¹⁶ The result then follows.
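The divisibility of the middle binomial coefficients by a prime p, on which this proof rests, is easy to test numerically. The sketch below is illustrative; `middle_coefficients_divisible` and `power_map_is_additive` are hypothetical helper names, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def middle_coefficients_divisible(n):
    """True if n divides C(n, k) for every k with 0 < k < n."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

def power_map_is_additive(n):
    """True if (x + y)^n = x^n + y^n for all x, y in Z/nZ."""
    return all(pow(x + y, n, n) == (pow(x, n, n) + pow(y, n, n)) % n
               for x in range(n) for y in range(n))

for n in (2, 3, 4, 5, 6, 7):
    print(n, middle_coefficients_divisible(n), power_map_is_additive(n))
```

For the composite value n = 4 the coefficient C(4, 2) = 6 is not a multiple of 4, and correspondingly (1 + 1)^4 ≠ 1^4 + 1^4 in Z/4Z.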


Finite Integral Domains 


We shall conclude this subsection with a result that is not only interesting in its own 
right, but whose proof involves one of the most important “counting arguments” in 
all of mathematics. This is the so-called “Pigeon Hole Principle,’ which says simply 
that any injective mapping of a finite set into itself must also be surjective. 


Theorem 7.4.7 Let D be a finite integral domain. Then D is a field. 


Proof Let d be a nonzero element of D; we shall show that d has a multiplicative
inverse. Note that right multiplication by d defines an injective mapping D → D.
For if ad = bd then a = b by the Cancellation Law (see Lemma 7.4.4). By the
Pigeon Hole Principle, this mapping must be surjective and so there must exist an
element c ∈ D such that cd = 1. This says that c is the multiplicative inverse of d
and the proof is complete.


¹⁶ In fact this argument is a special case of the same argument used in Wielandt’s proof of the Sylow
Theorems (Theorem 4.3.3, p. 117).
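The pigeonhole argument in the proof of Theorem 7.4.7 is effectively an algorithm for finding inverses: list the images of multiplication by d and look for 1. A minimal sketch follows, assuming the finite integral domain is Z/pZ for a prime p; the function name `inverse_by_pigeonhole` is hypothetical.

```python
def inverse_by_pigeonhole(d, p):
    """Find c with c*d = 1 in Z/pZ (p prime, d nonzero) by listing all products c*d."""
    images = [(c * d) % p for c in range(p)]
    # Multiplication by d is injective, so the p images are pairwise distinct
    # and therefore exhaust Z/pZ; in particular 1 occurs among them.
    assert len(set(images)) == p
    return images.index(1)

print(inverse_by_pigeonhole(3, 7))
```

This search is linear in p and serves only to mirror the proof; it is not meant as an efficient method.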


7.5 Exercises 


7.5.1 Warmup Exercises for Sect. 7.1 


1. Give a formal proof of Lemma 7.1.1 using only the axioms of a ring and known 
consequences (e.g. relevant Lemmata in Chap.3) of the fact that (R, +) is a 
commutative group. 

2. Do the even integers form a ring? Explain your answer. 

3. Let K be one of the familiar fields, Q, R, or C, or let K be the ring of integers 
Z. Let M2(K) be the collection of all 2-by-2 matrices with entries in K. Define 
addition and multiplication by the standard formulae 


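\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{pmatrix},

\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 a_2 + b_1 c_2 & a_1 b_2 + b_1 d_2 \\ c_1 a_2 + d_1 c_2 & c_1 b_2 + d_1 d_2 \end{pmatrix}.

(a) Show that M2(K) is a ring with respect to matrix addition and multiplication.
(b) Explicitly demonstrate that M2(K) is not a commutative ring.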
(c) Show that the subset 


T(K) = ie ‘) inhae K| 


is a subring. 
(d) Is the set 


Ob 
Ty(K) = ica) Ide «| 
a subring of M2(K)? 


(e) In the three cases where K is a field, is M2(K) a division ring? 
(f) Now suppose K = Z, the ring of integers. 


(i) Let
A = { \begin{pmatrix} a & b \\ c & d \end{pmatrix} | a, b, c, d ∈ Z, c is even }.
Is A a subring of M2(Z)?
(ii) Let


Is B a subring of M2(Z)?


4. Consider the subset S of the field Q of rational numbers consisting of all fractions
a/b, where the non-zero integer b is not divisible by 5. Is S a subring of Q?
Explain your answer.

5. Let R be the ring Z/(12) of integer residue classes modulo 12. List the units of
R and write out a multiplication table for the multiplicative group that they form.
Can you identify this multiplicative group as one that you have met before?

6. Let B be any non-empty subset of a ring R. Show explicitly that the centralizer
C_R(B) is a subring of R. [Hint: Use the three-point criterion of p. 193.]

7. (The Binomial Theorem for Commutative Rings) Let R be any commutative ring,
let x and y be two elements of R, and let n be a positive integer. Show that

(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k},

where as usual

\binom{n}{k} = n!/(k!(n − k)!)

denotes the combinatorial number. [Hint: Use induction on n and the usual recursion
identities for the combinatorial numbers.]


7.5.2 Exercises for Sect. 7.2 


1. Suppose α : R → S is a surjective homomorphism of rings. Suppose T is a
subring of S. Is the set

α⁻¹(T) := {r ∈ R | α(r) ∈ T}

a subring of R? (Be aware that subrings must contain their own multiplicative
identity element.)

2. Let R be a commutative ring and let P be an ideal of R. Prove that P is a prime
ideal if and only if whenever a, b ∈ R with ab ∈ P then a ∈ P or b ∈ P.

3. Prove Lemma 7.2.7.

4. In a commutative ring, a non-zero element a is said to be a zero divisor if and
only if there exists a non-zero element b such that ab = 0. Thus an integral
domain could be characterized as a non-zero commutative ring with no zero
divisors.

Show that if R is a commutative non-zero ring with only finitely many zero
divisors, then R is either an integral domain or is a finite ring. [Hint: For any
zero divisor a, consider the surjective homomorphism φ : (R, +) → (aR, +)
of additive abelian groups, defined by r ↦ ar, for all r ∈ R. Then note that
every non-zero element of aR and every non-zero element of ker φ is a zero
divisor. Apply the group isomorphism R/ker φ ≅ (aR, +) to deduce that ker φ
has finite index as a subgroup of (R, +). (Explain why the homomorphism φ
need not be a ring homomorphism.)]


5. Let R be a ring and let I, J be ideals of R. We say that I, J are relatively prime
(or are comaximal) if I + J = R. Prove that if I, J are relatively prime ideals
of R, then IJ + JI = I ∩ J.

6. (a) Prove the Chinese Remainder Theorem: Let R be a ring and let I, J be
relatively prime ideals of R. Then the ring homomorphism R → R/I × R/J
given by r ↦ (r + I, r + J) determines an isomorphism

R/(I ∩ J) ≅ R/I × R/J.

(b) More generally, if I_1, I_2, ..., I_k ⊆ R are pairwise relatively prime, then the
ring homomorphism R → R/I_1 × R/I_2 × ··· × R/I_k, r ↦ (r + I_1, r + I_2, ..., r + I_k)
determines an isomorphism

R/(I_1 ∩ I_2 ∩ ··· ∩ I_k) ≅ R/I_1 × R/I_2 × ··· × R/I_k.

[Hint: For the homomorphism in part (a), the kernel is easy to identify. The
issue is showing its surjectivity. For this use I + J = R to show that 0 × (R/J)
and (R/I) × 0 are both contained in the image of the homomorphism. Part
(b) cries out for an induction proof.]
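For orientation, the familiar special case R = Z, I = mZ, J = nZ with gcd(m, n) = 1 can be explored numerically. The sketch below is only an illustration of that special case and not a solution of the exercise; `crt_map`, `crt_is_bijection` and `preimage_of_one_zero` are hypothetical names, and the three-argument `pow` with exponent −1 requires Python 3.8 or later.

```python
def crt_map(r, m, n):
    """Image of the integer r under Z -> Z/mZ x Z/nZ."""
    return (r % m, r % n)

def crt_is_bijection(m, n):
    """True if the induced map Z/(mn)Z -> Z/mZ x Z/nZ hits all m*n pairs."""
    images = {crt_map(r, m, n) for r in range(m * n)}
    return len(images) == m * n

def preimage_of_one_zero(m, n):
    """An integer mapping to (1, 0); assumes gcd(m, n) = 1."""
    # n * (n^{-1} mod m) is a multiple of n that is congruent to 1 modulo m.
    return n * pow(n, -1, m)

print(crt_is_bijection(4, 9), crt_is_bijection(4, 6))
print(crt_map(preimage_of_one_zero(4, 9), 4, 9))
```

The element produced by `preimage_of_one_zero` plays the role of the element of J congruent to 1 modulo I that the hint obtains from I + J = R.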


7. Let R be a ring and let I ⊆ R be an ideal. Show that if I ⊆ P_1 ∪ P_2 ∪ ··· ∪ P_r,
where P_1, P_2, ..., P_r are prime ideals, then I ⊆ P_j for some index j. [Hint: for
each i = 1, 2, ..., r, let x_i ∈ I − (P_1 ∪ P_2 ∪ ··· ∪ P_{i−1} ∪ P_{i+1} ∪ ··· ∪ P_r).
Show that x_1 + x_2 x_3 ··· x_r ∉ P_1 ∪ ··· ∪ P_r.]

8. Recall the definition of residual quotient, (B : A) := {r ∈ R | Ar ⊆ B}. Let R
be a ring and let I, J and K be right ideals of R. Prove that ((I : J) : K) = (I :
JK). (Here, following the custom of the ring-theorists, the symbol JK denotes
the additive group generated by the set product JK. Of course it is a right ideal
of R.)

9. Let R be a commutative ring and let I be an ideal of R. Define the radical of I
to be the set

√I = {r ∈ R | r^m ∈ I for some positive integer m}.

Show that √I is an ideal of R containing I.
10. Let R be a commutative ring, and let Q ⊆ R be an ideal. We say that Q is
primary if ab ∈ Q and a ∉ Q implies that b^n ∈ Q for some positive integer n.
Prove the following for the primary ideal Q ⊆ R:

(a) P := √Q is a prime ideal containing Q. In fact P is the smallest prime
ideal containing Q. (In this case we call Q a P-primary ideal.)

(b) If Q is a P-primary ideal, ab ∈ Q, and a ∉ P, then b ∈ Q.

(c) If Q is a P-primary ideal and I, J are ideals of R with IJ ⊆ Q, I ⊈ P,
then J ⊆ Q.

(d) If Q is a P-primary ideal and if I is an ideal with I ⊈ P, then (Q : I) = Q.

11. Suppose that P and Q are ideals of the commutative ring R satisfying the
following:

(a) P ⊇ Q.
(b) If x ∈ P then for some positive integer n, x^n ∈ Q.
(c) If ab ∈ Q and a ∉ P, then b ∈ Q.

Prove that Q is a P-primary ideal.

12. Assume that Q_1, Q_2, ..., Q_r are all P-primary ideals of the commutative ring
R. Show that Q_1 ∩ Q_2 ∩ ··· ∩ Q_r is a P-primary ideal.

13. Let R be a commutative ring and let Q ⊆ R be an ideal. Prove that Q is a primary
ideal if and only if the only zero divisors of R/Q are nilpotent elements. (An
element r of a ring is called nilpotent if r^n = 0 for some positive integer n.)

14. Consider the ideal I = nZ[x] + xZ[x] ⊆ Z[x], where n ∈ Z is a fixed integer.
Prove that I is a maximal ideal of Z[x] if and only if n is a prime.

15. If R is a commutative ring and x ∈ ⋂{M | M is a maximal ideal}, show that
1 + x ∈ U(R), the group of units of R.


7.5.3 Exercises for Sect. 7.3 


1. Let R be a ring and let M be a monoid with the following property:

(F) Each element m possesses only finitely many distinct factorizations m = m_1 m_2
in M.

Define the completion, (RM)* of the monoid ring RM to be the set of all
functions f : M → R with point-wise addition and convolution multiplication.
(The point is that we are no longer limited to functions of finite support.) Show
that one still obtains a ring and that the usual monoid ring RM occurs as a subring
of (RM)*.

2. The completion of the polynomial ring R[x] is usually denoted R[[x]]. Note that
R[[x]] can be viewed as the ring of “power series” in the indeterminate x. Show
that while the only units of R[x] are in fact the units of R, R[[x]] has far more
units. In particular, note that

(1 − x)^{-1} = 1 + x + x² + ··· = \sum_{i=0}^{\infty} x^i.

Show that, in fact, f(x) = \sum_{i=0}^{\infty} a_i x^i is a unit of R[[x]] if and only if a_0 is a unit of
R. [Hint: Show that if f(x) · g(x) = 1, where f(x) = \sum a_i x^i, g(x) = \sum b_i x^i,
a_i, b_i ∈ R, then a_0 b_0 = 1 so a_0 is a unit.

On the other hand if a_0 is a unit in R, one can set b_0 = a_0^{-1}, and show inductively
that there exist solutions b_i to

\sum_{i=0}^{n} a_i b_{n-i} = 0, n ≥ 1,

in which each b_i is a Z-linear combination of monomial words in a_0^{-1} and
{a_1, a_2, ..., a_n}.]
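The recursion in the hint can be run directly when R is a field such as Q. The following sketch assumes rational coefficients (via `fractions.Fraction`) and a hypothetical function name `inverse_coefficients`; it solves a_0 b_n = −(a_1 b_{n−1} + ··· + a_n b_0) for successive n.

```python
from fractions import Fraction

def inverse_coefficients(a, terms):
    """First `terms` coefficients b_0, b_1, ... of g = 1/f, where f = sum a[i] x^i
    and a[0] is invertible (here: a nonzero rational number)."""
    a = [Fraction(c) for c in a] + [Fraction(0)] * terms  # pad with zeros
    b = [1 / a[0]]                                        # b_0 = a_0^{-1}
    for n in range(1, terms):
        # The coefficient of x^n in f*g must vanish:
        # a_0 b_n + a_1 b_{n-1} + ... + a_n b_0 = 0.
        s = sum(a[i] * b[n - i] for i in range(1, n + 1))
        b.append(-s / a[0])
    return b

print(inverse_coefficients([1, -1], 5))   # coefficients of 1/(1 - x)
print(inverse_coefficients([2, 1], 3))    # coefficients of 1/(2 + x)
```

Each b_n is visibly built from a_0^{-1} and a_1, ..., a_n by integer combinations of products, in line with the last sentence of the hint.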


3. Let R be a ring, and let M be the free commutative monoid on the two-element set
{x, y}. Thus M = {x^i y^j | i, j ∈ N}, where, as usual, we employ the convention
that when the natural number zero appears as an exponent, one reads this as the
monoid identity element. (Note that M can also be identified with the monoid of
multisets over {x, y}.) The monoid algebra RM is usually denoted by R[x, y] :=
RM and is called a polynomial ring in two commuting indeterminates. Elements
can be written as finite sums of the form

f(x, y) = \sum_{i,j ≥ 0} a_{ij} x^i y^j, a_{ij} ∈ R.

On the other hand, the polynomial ring construction can clearly be iterated,
giving the polynomial ring R[x][y] := (R[x])[y]. Show that these rings are
isomorphic.

4. (a) Suppose M and N are monoids with identity elements 1_M and 1_N (the
monoid operation in both cases is multiplication and is denoted by juxtaposition
of its arguments). A monoid homomorphism from M to N is a
mapping φ : M → N with the property that φ(1_M) = 1_N and φ(m_1 m_2) =
φ(m_1)φ(m_2) for all elements m_1, m_2 ∈ M. Fix a ring R and form the
monoid rings RM and RN. Consider the mapping ρ(φ) : RM → RN
which is defined by

\sum_{m ∈ M} r_m m ↦ \sum_{m ∈ M} r_m φ(m),

where r_m ∈ R and as usual, the elements r_m are non-zero for only finitely
many m’s. Show that ρ(φ) is a ring homomorphism. Show that ρ(φ) is one-to-one
(onto) if and only if φ is one-to-one (onto). In particular, if φ is an
automorphism of M, then ρ(φ) is an automorphism of RM.
(b) Similarly we may define an antiautomorphism of the monoid M to be a
bijection τ : M → M such that τ(m_1 m_2) = τ(m_2)τ(m_1). Prove that τ
must fix the identity element 1_M. In addition, let α be an antiautomorphism
of the ring R. Let the mapping ρ(α, τ) : RM → RM be defined by

\sum_{m ∈ M} r_m m ↦ \sum_{m ∈ M} r_m^α τ(m).

Show that ρ(α, τ) is an antiautomorphism of the monoid ring RM.

(c) Let rev : M(X) → M(X) be the antiautomorphism of the free monoid on
X which rewrites each word with the letters appearing in reverse order. Thus
rev(x_1 x_2 ··· x_n) = x_n x_{n−1} ··· x_2 x_1. Show that if R is a commutative ring,
then the mapping β : R{X} → R{X} defined by

\sum_w r_w w ↦ \sum_w r_w rev(w),

is an antiautomorphism of the ring R{X}. (Here as usual, the r_w are ring
elements which are non-zero for only finitely many words w.)

(d) Let β be the antiautomorphism of the preceding part of this exercise. Show
that the subset C_{R{X}}(β) of polynomials left fixed by β are the R-linear
combinations of the palindromic words, and, if |X| > 1, that this set is not
closed under multiplication.

5. Suppose X is a finite set, and R[X] is the ring of polynomials whose commuting
indeterminates are the elements of X. We suppose B is a ring which is either
the zero ring or contains R in its center. Suppose φ : X → B is any mapping
into the commutative ring B. As in Theorem 7.3.3, we define the mapping φ̂ :
R[X] → B which maps every polynomial \sum_m a_m m (with only finitely many
of the coefficients a_m in R nonzero) to the element \sum_m a_m φ(m). This mapping
simply substitutes φ(x_i) for x_i in each polynomial in R[X], for each x_i ∈ X. Show
that φ̂ possesses the following properties:

φ̂(b_1 + b_2) = φ̂(b_1) + φ̂(b_2)
φ̂(b_1 b_2) = φ̂(b_1) φ̂(b_2)
φ̂(rb) = r φ̂(b)

for any b, b_1, b_2 in B and any r in R.


6. Let α : X → B ⊆ C_S(R) be a mapping from a set X into a set B which is
contained in the centralizer, in ring S, of a subring R.

(a) Show that the corresponding evaluation mapping

E*_α : R{X} → S

is a ring homomorphism. (See p. 204 for definitions.)

(b) (Uniqueness) If we regard X as the set of words of length one in the free
monoid on X, show that E*_α is the unique ring homomorphism h : R{X} →
S which extends the given mapping α : X → B. [Show that h(p) = E*_α(p)
for each polynomial p.]


7. Let X = {x_1, ..., x_n} be any non-empty finite set. Recall that M(X) denotes the
free monoid on the set X, that is, the set of all “words” of finite length in the
alphabet X with the concatenation operation. Also one may recall that ℳ(X) is
the additive monoid of multisets of X, that is, sequences (a_1, ..., a_n) of non-negative
integers. Define the “letter inventory mapping” κ : M(X) → ℳ(X)
where, for a word w,

κ(w) = (a_1, ..., a_n)

if and only if, for each i, the letter x_i occurs exactly a_i times in the word w. Show
that κ is a monoid homomorphism (as defined in the first part of Exercise 4).
From the same exercise, conclude that for any commutative ring R, the mapping
κ extends to an R-homomorphism of rings R{X} → R[X].
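To get a feeling for the letter inventory mapping before proving anything, one can model words as Python strings and multisets as tuples of counts. This is only an exploratory sketch under those modelling assumptions; `kappa` and `add_multisets` are hypothetical names.

```python
from collections import Counter

def kappa(word, alphabet):
    """Letter inventory: how often each letter of `alphabet` occurs in `word`."""
    counts = Counter(word)
    return tuple(counts[x] for x in alphabet)

def add_multisets(u, v):
    """Componentwise sum, the operation of the additive monoid of multisets."""
    return tuple(p + q for p, q in zip(u, v))

alphabet = "xy"
u, v = "xyx", "yy"
# Concatenation of words corresponds to addition of inventories.
print(kappa(u + v, alphabet), add_multisets(kappa(u, alphabet), kappa(v, alphabet)))
# The empty word goes to the zero multiset.
print(kappa("", alphabet))
```

Note that `kappa("xy", alphabet)` and `kappa("yx", alphabet)` coincide, so the mapping is far from injective once the alphabet has two letters.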


8. Let R be a subring of ring S and let B be a set of pairwise commuting elements
of C_S(R). Suppose α : X → B is a mapping of set X into B.

(a) Show that the mapping E_α : R[X] → S defined on p. 204 is a homomorphism
of rings.

(b) (Uniqueness) Show that if X is regarded as a subset of R[X] (namely as the
set of those monomials which are words of length one), then for any ring
homomorphism ν : R[X] → S which extends α : X → B, one must have
ν = E_α.

9. Again let α : X → B be a mapping of set X into B, a collection of pairwise
commuting elements of the centralizer in ring S of its subring R. Let κ be the
monoid homomorphism of Exercise 7 and let

ρ(κ) : R{X} = RM(X) → Rℳ(X) = R[X]

be its extension to the monoid rings as in Exercise 4. Show that

E*_α = E_α ∘ ρ(κ).

[Hint: Use the uniqueness of E*_α.]

10. Let n be a positive integer. Suppose A = (a_{ij}) is an n × n matrix with entries
drawn from the ring R. For j ∈ {1, ..., n}, let e_j be the n-tuple of R^(n) whose
jth entry is 1 ∈ R, the multiplicative identity of R, and all of whose remaining
entries are equal to 0 ∈ R. The matrix A is said to be R-invertible if and only if
each e_j is an R-linear combination of the rows of A. Let X = {x_1, ..., x_n}, and
let

σ(A) : R[X] → R[X]

be the mapping which replaces each polynomial p(x_1, ..., x_n) by the polynomial

p(\sum a_{1i} x_i, \sum a_{2i} x_i, ..., \sum a_{ni} x_i),

the parameter i in each sum ranging from 1 to n. Show that if A is R-invertible,
then σ(A) is an automorphism of R[X]. [Hint: Use the evaluation homomorphism
E_α for an appropriate triplet (α, B, S). Then show that it is onto.]

11. Let G be a finite group, let F be a field, and let χ : G → F* be a homomorphism
of G into the multiplicative group F* := F\{0}. Define the element

ε_χ = \sum_{g ∈ G} χ(g^{-1}) g ∈ FG

and show that ε_χ² = |G| ε_χ.

12. Same assumptions as the above exercise. Show that the element ε_χ commutes
with every element of FG by showing that for all g ∈ G, g ε_χ = χ(g) ε_χ.


7.5.4 Exercises for Sect. 7.4 


1. Let {R_σ}_{σ∈I} be a family of rings and let P = ∏_{σ∈I} R_σ be the direct product of
these rings. If K is any subset of the index set I, set P_K := {f ∈ P | f(σ) =
0 for all σ ∉ K}. Show that P_K forms a (two-sided) ideal of P.

2. Let X be a set and let R be a ring. In Example 38 we defined the ring of functions
R^X. Prove that

R^X ≅ ∏_{x∈X} R_x,

where R_x ≅ R for each x ∈ X.

3. Prove that any non-zero subring of an integral domain contains the multiplicative
identity element of the domain. [Hint: use Lemma 7.4.4.]

4. Prove that the polynomial ring Z[x] is not a principal ideal domain by showing
that the ideal I := 2Z[x] + xZ[x] is not a principal ideal.

5. Let d be a positive integer congruent to −1 modulo 4 and consider the integral
domain

D := {a + b√−d | a, b ∈ Z}

whose operations are inherited as a subring of the complex number field C. Show
that the ideal I := 2D + (1 + √−d)D is not a principal ideal.

6. Suppose ω = e^{2πi/3} = (−1 + √−3)/2, so ω is a complex number satisfying
the identity: ω² = −ω − 1. On p. 189 and again on p. 218 we introduced the
Eisenstein numbers, the integral domain Z[ω]. Show that the mapping Z[ω] →
Z[ω] defined by

a + bω ↦ a + bω², a, b ∈ Z

is an automorphism of this ring.
7. Let τ : M2(C) → M2(C) be the composition of the transpose antiautomorphism,
T, and the automorphism m(c) induced by complex conjugation (c : C → C)
of the matrix entries. (See Example 41 for notation and the fact that τ = c ∘ T =
T ∘ c.) Let H = {h(α, β) | (α, β) ∈ C × C} be the ring of real quaternions viewed
as a subring of M2(C). (See p. 215.)

(a) Show that as an antiautomorphism of M2(C), τ² is the identity
automorphism, that is, τ is an involution.

(b) Show that H is invariant under this antiautomorphism.

(c) Establish that h(α, β)^τ = h(ᾱ, −β) and conclude that the centralizer in H
of the antiautomorphism τ (that is, its fixed points) is the center of H, the
set R I_2 = {h(r, 0) | r ∈ R}, the real multiples of the 2-by-2 identity matrix.

(d) Show that for any h ∈ H, D(h) := h h^τ is a “real” element, that is, an
element of the center Z(H). [Hint: Show that h h^τ is fixed by τ.]

(e) Compute that D(h(α, β)) = N(α) + N(β) = det(h(α, β)).

(f) Show that D(h_1 h_2) = D(h_1)D(h_2), for all h_1, h_2 ∈ H. [Hint: Use the fact
that τ is an antiautomorphism of H and that D(H) is in the center of H.]

(g) Let

H_Z := {h(α, β) | α, β Gaussian integers}.

Show that H_Z is a subring of H which is also invariant under the antiautomorphism
τ. [Hint: Just check the rules for multiplying and applying τ for
the quaternionic numbers h(α, β).]

(h) Using the previous two steps, prove that the set of integers which are the
sum of four perfect squares, is closed under multiplication. (This set turns
out to include all natural numbers.)
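The conclusion of part (h) can be tested numerically before it is proved. The sketch below assumes the usual identification of an integral quaternion with a 4-tuple (a, b, c, d) of integers, standing for a + bi + cj + dk and multiplied by Hamilton's rules; this coordinate description, and the names `qmul` and `qnorm`, are assumptions made for illustration and are not the matrix notation h(α, β) of the exercise.

```python
from itertools import product

def qmul(p, q):
    """Hamilton product of two quaternions given as 4-tuples of integers."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qnorm(p):
    """Sum of the four squares of the coordinates."""
    return sum(t * t for t in p)

# The norm of a product equals the product of the norms on a small grid,
# so a product of two sums of four squares is again a sum of four squares.
grid = list(product(range(-1, 2), repeat=4))
norm_is_multiplicative = all(qnorm(qmul(p, q)) == qnorm(p) * qnorm(q)
                             for p in grid for q in grid)
print(norm_is_multiplicative, qmul((1, 1, 1, 1), (1, 2, 3, 4)))
```

For example, 4 = 1² + 1² + 1² + 1² and 30 = 1² + 2² + 3² + 4² multiply to 120 = 8² + 4² + 2² + 6².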
Chapter 8
Elementary Properties of Modules 


Abstract One way to study rings is through R-modules, which (in some possi- 
bly non-faithful way) serve to “represent” the ring. But such a view is restrictive. 
With R-modules one enjoys an increased generality. Any property possessed by an 
R-module can conceivably apply to the ring R itself, as a module over itself. Also, 
any universal property gains a greater strength, when one enlarges the ambient realm 
in which the property is stated—in this case from rings R, to their R-modules. This 
chapter still sticks to basics: homomorphisms, submodules, direct sums and products 
and free R-modules. Beyond that, chain conditions on the poset of submodules can 
be seen to have important consequences in two areas: endomorphisms of modules, 
and their generation. The former yields the Krull-Remak-Schmidt Theorem con- 
cerning the uniqueness of direct decompositions into indecomposable submodules, 
while, for the latter, the ascending chain condition is connected with finite generation 
via Noether’s Theorem. (The existence of rings of integral elements is derived from 
this theorem.) The last sections of the chapter introduce exact sequences, projec- 
tive and injective modules, and mapping properties of Hom(M, N)—a hint of the 
category-theoretic point of view to be unfolded at the beginning of Chap. 13. 


8.1 Basic Theory 


8.1.1 Modules over Rings 


Fix a ring R. A left module over R or left R-module is an additive group M = (M, +),
which admits a composition (or scalar multiplication)


R × M → M


(which we denote as a juxtaposition of arguments, as we do for multiplication in R)
such that for all elements a, b ∈ R, m, n ∈ M,


a(m + n) = am + an (8.1)
(a + b)m = am + bm (8.2)
(ab)m = a(bm) (8.3)
1·m = m, (8.4)


where 1 denotes the multiplicative identity element of R. It follows from 0_R + 0_R =
0_R that for any element m of M, 0_R m + 0_R m = 0_R m. Thus 0_R m = 0_M for any
m ∈ M. Here, of course, 0_R denotes the additive identity of the additive group (R, +)
while 0_M denotes the additive identity of the additive group (M, +).

A right R-module is an additive group M = (M, +) admitting a right composition
M × R → M, such that, subject to the same conventions of denoting this composition
by juxtaposition, we have for all a, b ∈ R, m, n ∈ M,


(m + n)a = ma + na (8.5)
m(a + b) = ma + mb (8.6)
m(ab) = (ma)b (8.7)
m1 = m. (8.8)


Again we can easily deduce the identities


m 0_R = 0_M (8.9)
m(−a) = −(ma) (8.10)


where 0_R and 0_M are the additive identity elements of the groups (R, +) and (M, +),
and where m is an arbitrary element of M.

Notice that for a left R-module, M, the set {r ∈ R | rM = 0} is a 2-sided ideal
of R. It is called the left annihilator of M. Similarly, for a right R-module M the
set {r ∈ R | Mr = 0} is also a 2-sided ideal, called the right annihilator of M. In
either case (left module or right), the module is said to be faithful if and only if the
corresponding annihilating ideal is the 0-ideal. (See Exercise (3) in Sect. 8.5.1.)
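As a concrete illustration of an annihilator, consider the Z-module M = Z/4Z × Z/6Z, an example chosen here and not taken from the text. Its annihilator is the ideal 12Z, and the sketch below finds the generator 12 by brute force; `annihilator_generator` is a hypothetical name.

```python
from itertools import product

def annihilator_generator(moduli):
    """Smallest r >= 1 with r*m = 0 for every m in Z/n1 x ... x Z/nk,
    regarded as a Z-module with componentwise action."""
    elements = list(product(*(range(n) for n in moduli)))
    r = 1
    # Increase r until it kills every coordinate of every element.
    while not all((r * x) % n == 0 for m in elements for x, n in zip(m, moduli)):
        r += 1
    return r

print(annihilator_generator((4, 6)))
```

Since the annihilator 12Z is non-zero, this module is not faithful as a Z-module.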


8.1.2 The Connection Between R-Modules and 
Endomorphism Rings of Additive Groups 


Consider first a left R-module M. For each element r ∈ R, the mapping ρ(r) :
(M, +) → (M, +) defined by m ↦ rm, is an endomorphism of the additive group
(M, +). (This is a consequence of Eq. (8.1).) As remarked in Example 41 of Chap. 7,
the set End(M) of all endomorphisms of the additive group (M, +) forms a ring
under point-wise addition of endomorphisms and composition of endomorphisms.
Now for any elements r, s in the ring R and m ∈ M, Eq. (8.3) yields

ρ(rs)(m) = (rs)m = r(sm) = ρ(r)[ρ(s)(m)] = (ρ(r) ∘ ρ(s))(m).


Thus
ρ(rs) = ρ(r) ∘ ρ(s). (8.11)


Similarly, Eq. (8.2) yields
ρ(r + s) = ρ(r) + ρ(s). (8.12)


These equations tell us that ρ : R → End(M) is a homomorphism of rings.

Notice that ker ρ = {r ∈ R | rM = 0}, is the left annihilator of M as defined in
the previous subsection.

Conversely, if we are given such a ring homomorphism ρ : R → End(M), where
M is an additive group, then M acquires the structure of a left R-module by setting
rm := ρ(r)(m), for all r ∈ R, m ∈ M.
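Equations (8.11) and (8.12) and the description of ker ρ can be verified in a tiny example. The sketch below takes R = Z/4Z acting on M = Z/2Z by r·m = rm mod 2, a choice made here purely for illustration; an endomorphism of M is stored as the tuple of its values, and all names are hypothetical.

```python
R = range(4)   # the ring Z/4Z
M = range(2)   # the additive group Z/2Z, a left Z/4Z-module via r*m mod 2

def rho(r):
    """The endomorphism m -> r*m of M, stored as the tuple of its values."""
    return tuple((r * m) % 2 for m in M)

def compose(f, g):
    """Composition f o g: apply g first, then f."""
    return tuple(f[g[m]] for m in M)

def add_maps(f, g):
    """Point-wise sum of two endomorphisms of M."""
    return tuple((f[m] + g[m]) % 2 for m in M)

multiplicative = all(rho((r * s) % 4) == compose(rho(r), rho(s)) for r in R for s in R)
additive = all(rho((r + s) % 4) == add_maps(rho(r), rho(s)) for r in R for s in R)
kernel = [r for r in R if rho(r) == (0, 0)]   # elements acting as the zero map
print(multiplicative, additive, kernel)
```

The kernel {0, 2} is the annihilator of M in Z/4Z, so this module is not faithful.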

In the same vein, a right R-module structure on the abelian group (M, +) is equivalent
to specifying a ring homomorphism R → Opp(End(M)). Here Opp(End(M)) is
the additive group of all endomorphisms of M endowed with a multiplication ∘,
where, for any endomorphisms α and β,


α ∘ β = βα,


the endomorphism obtained by first applying α and then applying β in that chronological
order. (See Example 40.)


Remark At times one might wish to convert a left R-module to a right S-module, or 
the reverse. We have the following constructions: 


Suppose R and S are rings whose multiplicative identity elements are denoted e_R
and e_S, respectively.


1. Let M be a left R-module. Suppose ψ : S → Opp(R) is a ring homomorphism
for which ψ(e_S) = e_R. For each m ∈ M and s ∈ S define


ms := ψ(s)m.


Since ψ(s) ∈ R, and M is a left R-module, the right side is a well-defined element
of M. With multiplication M × S → M defined in this way, it is easy to show that M
becomes a right S-module.

2. Similarly if M is a right S-module, then, from a ring homomorphism φ : R →
Opp(S) for which φ(e_R) = e_S, one may convert M into a left R-module.


Proof of the appropriate module axioms is requested in Exercises in Sect. 8.5.2.
8.1.3 Bimodules


Finally, there is a third type of module, an (R,S)-bimodule. Here M is a left 
R-module and at the same time a right S-module, and the left and right 
compositions are connected by the law 


r(ms) = (rm)s, for all (r, m, s) ∈ R × M × S. (8.13)


Although this resembles a sort of “associative law”, what it really says is that every 
endomorphism of M induced by right multiplication by an element of ring S com- 
mutes with every endomorphism induced by left multiplication by an element of ring 
R. We shall consider bimodules in more detail later, especially in our discussions of 
tensor products (see Sect. 13.3). 

Here are some examples of bimodules. 


Example 1 Let T be a ring and let R and S be subrings of T, each containing the
multiplicative identity element e_T of T. Then (T, +) is an additive group, which
admits a composition T × S → T. It follows directly from the ring axioms, that with
respect to this composition, T becomes a right S-module. We denote this module by
the symbol T_S when we wish to view T in this way.

Also, (T, +) admits a composition R × T → T with respect to which T becomes
a left R-module which we denote by _R T.

Finally (T, +) admits a composition


R × T × S → T


inherited directly from ring multiplication. With respect to this composition T
becomes an (R, S)-bimodule, which we denote by the symbol _R T_S. Notice that
here, the special law for a bimodule (Eq. 8.13) really is the associative law of T
(restricted to certain triples, of course). (Note that in this example, T can be replaced
by any 2-sided ideal I, to form an (R, S)-bimodule _R I_S.)


Example 2 If R is a commutative ring, and if M is a right R-module, we may define
a left R-module structure on M by defining r·m := mr, r ∈ R, m ∈ M. Note that
the commutativity of R guarantees the validity of condition (8.3) since


r(ms) = r(sm) = (rs)m = m(rs) = (mr)s = (rm)s. (8.14)


Thus we see that M is also an (R, R)-bimodule. Bimodules constructed in this way 
have a role to play in defining multiple tensor products and the tensor algebra (pp. 
481-493), and for this reason we give them a special name: symmetric bimodules. 
Example 3 Here is a simple variation on the preceding example. Let σ be an automorphism
of the commutative ring R. Then a right R-module M can be converted
into an (R, R)-bimodule by declaring that mr = r^σ m for all (m, r) ∈ M × R.
(One simply emulates the sequence of Eq. (8.14) applying σ or its inverse when ring
elements pass to the left or right over module elements.)


Remark In these two examples, the commutativity of R is more or less forced. For
suppose M is an arbitrary right R-module and σ is an automorphism of the additive group
(R, +). If we attempt to convert M into an (R, R)-bimodule by the rule mr = r^σ m
for all (m, r) ∈ M × R, three conclusions are immediate: (i) σ is easily seen to be
an antiautomorphism of R (that is, (rs)^σ = s^σ r^σ), (ii) σ takes the right annihilator
of M_R, that is, the two-sided ideal I = Ann_R(M_R) := {r ∈ R | Mr = 0}, to the left
annihilator I^σ = Ann_R(_R M), and (iii) the factor ring R/I is commutative. Thus
if I = I^σ, we have recovered Example 3 above with R replaced by R′ := R/I.


8.1.4 Submodules 


Fix a (right) R-module M. A submodule is an additive subgroup of (M, +) that is 
closed under all scalar multiplications by elements of R. 

It is easy to see that a subset N is a submodule of the right R-module M, if and 
only if 


(i) $N + N = N$,
(ii) $-N = N$, and
(iii) $NR \subseteq N$.


It should be remarked that the last containment is actually the equality $NR = N$, since
right multiplication by the identity element $1$ induces the identity endomorphism
of $N$.
Suppose now $\mathcal{N} = \{N_\sigma \mid \sigma \in I\}$ is a family of submodules of the right
$R$-module $M$. The intersection

$$\bigcap_{\sigma \in I} N_\sigma$$

of all these submodules is clearly a submodule of $M$. This is the unique supremum
(that is, the largest member) of the poset of all submodules contained in all of the
submodules in $\mathcal{N}$.

Now let $X$ be an arbitrary subset of the right $R$-module $M$. The submodule
generated by $X$ is defined to be the intersection of all submodules of $M$ which contain
the subset $X$ and is denoted $\langle X \rangle_R$. From the previous paragraph, this intersection is
the unique smallest submodule in the poset of all submodules of $M$ which contain $X$.

This object has another description. Let $\bar{X}$ be the set of all elements of $M$ which
are finite (right) $R$-linear combinations of elements from $X$, that is, elements of the
form $x_1r_1 + \cdots + x_nr_n$ where $n$ is any natural number, the $x_i$ belong to $X$ and the $r_i$
belong to the ring $R$. Then $\bar{X}$ is a subgroup (recall that $x(-1_R) = -x$) and $\bar{X}R = \bar{X}$,
so $\bar{X}$ is a submodule. On the other hand, it must lie in any submodule which contains
$X$ and so must coincide with $\langle X \rangle_R$.
In the case that $X = \{x_1, \ldots, x_n\}$ is a finite set,

$$\bar{X} = x_1R + \cdots + x_nR.$$


A right $R$-module is said to be finitely generated if and only if it is generated by a
finite set. Thus a right $R$-module $N$ is finitely generated if and only if

$$N = x_1R + \cdots + x_nR$$

for some finite subset $\{x_1, \ldots, x_n\}$ of its elements.

Now once again let $\mathcal{N} = \{N_\sigma \mid \sigma \in I\}$ be a family of submodules of the right
$R$-module $M$. The sum over $\mathcal{N}$ is the submodule of $M$ generated by the set-theoretic
union $\bigcup \{N \mid N \in \mathcal{N}\}$ of all the submodules in the family $\mathcal{N}$ and is denoted
$\sum_{\sigma \in I} N_\sigma$. It is therefore the set of all finite $R$-linear combinations of elements
which lie in at least one of the submodules in $\mathcal{N}$. But of course, as each submodule is
invariant under right multiplication by elements of $R$, this simply consists of all finite
sums of elements lying in at least one module of the family. In any event, this is the
unique smallest member of the set of all submodules of $M$ which contain every member
of $\mathcal{N}$.

We summarize this information in 


Lemma 8.1.1 Let $R$ be a ring, and let $M$ be a right $R$-module. The poset of all
submodules of $M$ (with containment as the partial order) is a lattice with unrestricted
meets and joins, given respectively by the intersection and sum operations on arbitrary
families of submodules.


Obviously, a corresponding lemma holds for the poset of submodules of a given
left module over some ring $R$.
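
As a small illustration of Lemma 8.1.1, the following sketch lists the submodules of the $\mathbb{Z}$-module $\mathbb{Z}/12\mathbb{Z}$ and computes meets and joins as intersections and sums. The choice of $\mathbb{Z}/12\mathbb{Z}$ and the helper names `generated`, `meet` and `join` are assumptions made for this example only; they do not come from the text.

```python
from math import gcd

N = 12  # illustrative choice: the Z-module Z/12Z


def generated(gens, n=N):
    """Submodule of Z/nZ generated by gens (all Z-linear combinations mod n)."""
    d = n
    for g in gens:
        d = gcd(d, g % n)
    # The generated submodule consists of the multiples of d = gcd(n, gens).
    return frozenset(range(0, n, d))


def meet(a, b):
    """Lattice meet: the intersection of two submodules."""
    return a & b


def join(a, b, n=N):
    """Lattice join: the sum A + B = {x + y : x in A, y in B}."""
    return frozenset((x + y) % n for x in a for y in b)


# Every submodule of Z/nZ is cyclic, generated by a divisor of n.
submodules = {generated([d]) for d in range(1, N + 1) if N % d == 0}

A = generated([4])  # {0, 4, 8}
B = generated([6])  # {0, 6}
print(sorted(meet(A, B)))  # [0]
print(sorted(join(A, B)))  # [0, 2, 4, 6, 8, 10]
```

Here the join of the two submodules is the submodule generated by their union, in agreement with the description of the sum given above.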


8.1.5 Factor Modules 


Now suppose $N$ is a submodule of the right $R$-module $M$. Since $N$ is a subgroup
of the additive group $(M, +)$, we may form the factor group $M/N$ whose elements
are the additive cosets $x + N$, $x \in M$, of $N$. Next, define an $R$-scalar multiplication
$M/N \times R \to M/N$ by setting $(x + N)r := xr + N$. That this is well defined is
an immediate consequence of the fact that $Nr \subseteq N$ for each $r \in R$. Showing the
remaining requisite properties (8.5)-(8.8) is equally routine, completing the
definition of the quotient right $R$-module $M/N$. Again, this construction can be
duplicated for left $R$-modules: for each submodule $N$ of a left $R$-module $M$, we obtain
a left $R$-module $M/N$.


8.1.6 The Fundamental Homomorphism Theorems for R-Modules


Suppose $M$ and $N$ are right $R$-modules (for the same ring $R$). An $R$-module
homomorphism $M \to N$ is a homomorphism $f : (M, +) \to (N, +)$ between the
underlying additive groups, such that

$$f(m)r = f(mr), \qquad (8.15)$$

for all $m \in M$ and all $r \in R$. In other words, these are group homomorphisms which
commute with the right multiplications by each of the elements of the ring $R$.

It follows from the definition that the composition $\beta \circ \alpha$ of two module
homomorphisms $\alpha : A \to B$ and $\beta : B \to C$ is an $R$-homomorphism $A \to C$.

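Before going further, we will find it convenient to adopt a common visual-linguistic
device for discussing compositions of homomorphisms. Suppose $\alpha : A \to B$,
$\beta : B \to C$ and $\gamma : A \to C$ are homomorphisms. This data is assembled in the
following diagram:

$$\begin{array}{ccc} A & \xrightarrow{\ \alpha\ } & B \\ & {\scriptstyle\gamma}\searrow & \downarrow{\scriptstyle\beta} \\ & & C \end{array}$$

We say that the “diagram commutes” if and only if $\gamma = \beta \circ \alpha$, the composition
of $\alpha$ and $\beta$.
We use similar language for square diagrams:

$$\begin{array}{ccc} A & \xrightarrow{\ \gamma\ } & B \\ {\scriptstyle\alpha}\downarrow & & \downarrow{\scriptstyle\delta} \\ C & \xrightarrow{\ \beta\ } & D \end{array}$$

Again, such a diagram “commutes” if and only if $\delta \circ \gamma = \beta \circ \alpha$.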

The adjectives which accrue to an $R$-homomorphism are exactly those which are
applicable to it as a homomorphism of additive groups: Thus the $R$-homomorphism
$f$ is
• an $R$-epimorphism
• an $R$-monomorphism
• an $R$-isomorphism
if and only if $f$ is an epimorphism, monomorphism, or isomorphism, respectively,
of the underlying additive groups.

We define the kernel of an $R$-homomorphism $f : M \to N$ among right
$R$-modules simply to be the kernel of $f$ as a homomorphism of additive groups.
Explicitly, $\ker f$ is the set of all elements of $M$ which are mapped to the additive
identity $0_N$ of the additive group $N$. But the equation $0_NR = 0_N$ shows
that if $x$ lies in $\ker f$, then the entire set $xR$ must be contained in $\ker f$ since
$f(xR) = f(x)R = 0_NR = 0_N$. Thus the kernel of an $R$-homomorphism must
be an $R$-submodule. At the same time, the image $f(M) \subseteq N$ of the homomorphism
is easily checked to be a submodule of $N$. Note finally that if $K \subseteq M$ is a submodule
of $M$, then there exists a surjective homomorphism $\pi : M \to M/K$, defined by
setting $\pi(m) := m + K \in M/K$, $m \in M$. This is called the projection homomorphism
of $M$ onto $M/K$.

The student should be able to restate the obvious analogues of the above definitions 
for left R-modules as well. 

Now we can state 


Theorem 8.1.2 (The Fundamental Theorem of Module Homomorphisms) Suppose
$f : M \to N$ is a homomorphism of right (left) $R$-modules.


(i) If $K \subseteq \ker f$ is a submodule of $M$, then the mapping $\bar{f} : M/K \to N$ defined
by setting $\bar{f}(m + K) := f(m)$ is a well-defined module homomorphism making
the diagram below commute:

$$\begin{array}{ccc} M & \xrightarrow{\ f\ } & N \\ {\scriptstyle\pi}\downarrow & \nearrow{\scriptstyle\bar{f}} & \\ M/K & & \end{array}$$

(ii) If $K = \ker f$, then the homomorphism $\bar{f} : M/K \to f(M)$ is an isomorphism.


Proof We have already seen that the mapping $\bar{f} : M/K \to N$ is a well-defined
homomorphism of abelian groups. However, as

$$\bar{f}((x + K)r) = f(xr) = f(x)r = (\bar{f}(x + K))r,$$

we infer that $\bar{f}$ is also a homomorphism of $R$-modules, proving part (i). Note that if
$K = \ker f$, then we again recall that the induced homomorphism $\bar{f} : M/K \to f(M)$
is an isomorphism of abelian groups. From part (i) $\bar{f}$ is an $R$-module homomorphism,
proving part (ii).


Theorem 8.1.3 (i) (The Correspondence Theorem for R-modules.) If $f : M \to N$
is an $R$-epimorphism of right (left) $R$-modules, then there is a poset isomorphism
between $S(M, \ker f)$, the poset of all submodules of $M$ which contain
$\ker f$, and the poset $S(N, 0_N)$ of all submodules of $N$.
(ii) (The Composition Theorem for R-modules.) Suppose $A \le B \le M$ is a chain
in the poset of submodules of a right (left) $R$-module $M$. Then there is an
$R$-isomorphism

$$M/B \to (M/A)/(B/A).$$


Proof If $N' \in S(N, 0_N)$, then $f^{-1}(N') \in S(M, \ker f)$ and $ff^{-1}(N') = N'$, since $f$
is surjective. On the other hand, if $M' \in S(M, \ker f)$, then $f^{-1}f(M') \supseteq M'$.
However, if there exists an element $m \in f^{-1}f(M') \setminus M'$, then $f(m) = f(m')$
for some $m' \in M'$. But then $m - m' \in \ker f \subseteq M'$, forcing $m - m' \in M'$ and so
$m \in M'$, a contradiction. Therefore $f^{-1}f(M') = M'$, and the homomorphism $f$
establishes a bijection $S(M, \ker f) \to S(N, 0_N)$. Since both $M' \mapsto f(M')$ and its
inverse $N' \mapsto f^{-1}(N')$ preserve containment, this bijection is a poset isomorphism,
proving part (i).

For part (ii), note that the projection homomorphism $p : M \to M/B$ contains
the submodule $A$ in its kernel. Therefore, by Theorem 8.1.2, part (i), the induced
mapping $\bar{p} : M/A \to M/B$ is a well-defined surjective $R$-module homomorphism.
However, since it is clear that $\ker \bar{p} = B/A$, we may apply Theorem 8.1.2, part (ii),
to infer that $(M/A)/(B/A) \cong M/B$, proving part (ii).


Theorem 8.1.4 (The Modularity Theorem for R-modules.) Suppose $A$ and $B$ are
submodules of the right (left) $R$-module $M$. Then there is an $R$-isomorphism

$$(A + B)/A \to B/(B \cap A).$$

Proof We have the composite homomorphism

$$B \hookrightarrow A + B \to (A + B)/A,$$

whose kernel is obviously $B \cap A$. Since every element of $(A + B)/A$ is of the form
$(a + b) + A = b + A$, $a \in A$, $b \in B$, we infer that the above homomorphism is
also surjective. Now apply Theorem 8.1.2, part (ii).
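
The Modularity Theorem can be checked by brute force in a small case. The sketch below works in the $\mathbb{Z}$-module $\mathbb{Z}/60\mathbb{Z}$ with $A = 10\mathbb{Z}/60\mathbb{Z}$ and $B = 6\mathbb{Z}/60\mathbb{Z}$ (both choices, and the helper names `cyclic` and `cosets`, are assumptions made for illustration) and verifies that $b + (B \cap A) \mapsto b + A$ is a well-defined bijection $B/(B \cap A) \to (A + B)/A$.

```python
from math import gcd


def cyclic(d, n):
    """The submodule dZ/nZ of the Z-module Z/nZ."""
    return frozenset(range(0, n, gcd(d, n)))


def cosets(sub, mod, n):
    """The factor module sub/mod as the set of cosets x + mod, x in sub."""
    return {frozenset((x + m) % n for m in mod) for x in sub}


n = 60
A, B = cyclic(10, n), cyclic(6, n)
A_plus_B = frozenset((a + b) % n for a in A for b in B)
A_cap_B = A & B

# Graph of the map b + (B ∩ A) |-> b + A induced by B -> A + B -> (A + B)/A.
pairs = {
    (frozenset((b + k) % n for k in A_cap_B), frozenset((b + a) % n for a in A))
    for b in B
}
keys = {k for k, _ in pairs}
values = {v for _, v in pairs}

well_defined = len(pairs) == len(keys)    # each coset has exactly one image
injective = len(pairs) == len(values)     # distinct cosets have distinct images
surjective = values == cosets(A_plus_B, A, n)
print(well_defined, injective, surjective)  # True True True
```

Only the underlying bijection of cosets is tested here; that the map respects addition and the scalar action is immediate from its definition on representatives.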


8.1.7 The Jordan-Hölder Theorem for R-Modules


An $R$-module is said to be irreducible if and only if its only submodules are itself
and the zero submodule; that is, it has no nonzero proper submodules.

Let $\mathrm{Irr}(R)$ denote the collection of all isomorphism classes of non-zero irreducible
$R$-modules. Suppose $M$ is a given $R$-module, and let $P(M)$ be the lattice of all its
submodules. Recall that an interval $[A, B]$ in $P(M)$ is the set
$\{C \in P(M) \mid A \le C \le B\}$ (this is considered to be undefined if $A$ is not a
submodule of $B$). Therefore, in this context, a cover is an interval $[A, B]$ such that $A$
is a maximal proper submodule of $B$. Finally, recall that a non-empty interval $[A, B]$ is
said to be algebraic if and only if there exists a finite unrefinable chain of submodules
from $A$ to $B$.

Because of the Correspondence Theorem for R-modules (Theorem 8.1.3), we see
that $[A, B]$ is a cover if and only if $B/A$ is a non-trivial irreducible $R$-module. Thus
we have a well-defined mapping

$$\mu : \mathrm{Cov}(P(M)) \to \mathrm{Irr}(R),$$

from the set of all covers in the lattice $P(M)$ to the collection of all isomorphism
classes of non-zero irreducible $R$-modules. If $A_1$ and $A_2$ are distinct proper
submodules of a submodule $B$, with each $A_i$ maximal in $B$, then $A_1 + A_2 = B$, and by
the Modularity Theorem for R-modules (Theorem 8.1.4), $A_2/(A_1 \cap A_2)$ (being
isomorphic to $B/A_1$) is an irreducible $R$-module. Thus if
$\{[A_1, B], [A_2, B]\} \subseteq \mathrm{Cov}(P(M))$ and $A_1 \ne A_2$, then
$[A_1 \cap A_2, A_i]$, $i = 1, 2$, are also covers. This is the statement that


• The poset $P(M)$ of all submodules of the $R$-module $M$ is a semimodular lattice.


At this point the Jordan-Hölder Theorem (Theorem 2.5.2 of Chap. 2) for
semimodular lower semilattices implies the following:


Theorem 8.1.5 (Jordan-Hölder Theorem for R-modules.) Let $M$ be an arbitrary
$R$-module, and let $P(M)$ be its poset of submodules.


(i) Then the mapping

$$\mu : \mathrm{Cov}(P(M)) \to \mathrm{Irr}(R)$$

extends to an interval measure

$$\mu : \mathrm{Alg}(P(M)) \to \mathcal{M}(\mathrm{Irr}(R)),$$

from the set of all algebraic intervals of $P(M)$ to the monoid of multisets over
the collection of all isomorphism classes of non-trivial irreducible $R$-modules.
Specifically, the value of this measure $\mu$ at any algebraic interval $[A, B]$ is
the multiset which inventories the collection of irreducible modules $A_{i+1}/A_i$
which appear for any finite unrefinable chain $A = A_0 < A_1 < \cdots < A_k = B$
according to their isomorphism classes. The multiset of isomorphism classes of
irreducible modules is the same for any such chain.

(ii) Suppose $M$ itself possesses a composition series, that is, there exists a finite
unrefinable properly ascending chain of submodules beginning at the zero
submodule and terminating at $M$. The composition factors are the finitely many
non-trivial irreducible modules formed from the factors between successive
members of the chain. Then the collection of irreducible modules so obtained
is the same (up to isomorphism) with the same multiplicities, no matter what
composition series is used. In particular all composition series of $M$ have the
same length.
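
Part (ii) can be seen concretely in the $\mathbb{Z}$-module $\mathbb{Z}/12\mathbb{Z}$. The sketch below (the module, the two chains, and the helper names are illustrative assumptions) walks along two different composition series and tallies the orders of the successive factors. For these finite $\mathbb{Z}$-modules a factor of prime order $p$ is irreducible and isomorphic to $\mathbb{Z}/p\mathbb{Z}$, so the order identifies the isomorphism class.

```python
from collections import Counter


def submodule(d, n):
    """dZ/nZ inside Z/nZ, for a divisor d of n."""
    return frozenset(range(0, n, d))


def is_prime(p):
    return p > 1 and all(p % q for q in range(2, int(p ** 0.5) + 1))


def composition_factors(divisors, n):
    """Multiset of factor orders along the chain of submodules dZ/nZ.

    `divisors` lists divisors of n, from n (the zero submodule) down to 1
    (all of Z/nZ), so that the chain is ascending.
    """
    chain = [submodule(d, n) for d in divisors]
    orders = []
    for small, big in zip(chain, chain[1:]):
        if not small < big:
            raise ValueError("chain is not properly ascending")
        index = len(big) // len(small)
        if not is_prime(index):
            raise ValueError("chain can be refined: factor is not irreducible")
        orders.append(index)
    return Counter(orders)


first = composition_factors([12, 6, 3, 1], 12)   # 0 < 6Z/12 < 3Z/12 < Z/12
second = composition_factors([12, 4, 2, 1], 12)  # 0 < 4Z/12 < 2Z/12 < Z/12
print(first == second)  # True
```

Both series have length three and yield two factors of order 2 and one of order 3, although the factors occur in a different order along the two chains.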


8.1.8 Direct Sums and Products of Modules


Let $I$ be any index set, and let $\{M_\sigma \mid \sigma \in I\}$ be a family of right $R$-modules
indexed by $I$. Recall that the direct product of the additive groups $(M_\sigma, +)$ is the
collection of all functions

$$f : I \to \bigcup_{\sigma \in I} M_\sigma \quad \text{(disjoint union)}$$

subject to the condition that $f(\sigma) \in M_\sigma$ for each $\sigma \in I$. This has the structure of an
additive group $\prod_{\sigma \in I} M_\sigma$ with addition defined “coordinatewise,” i.e.,
$(f + g)(\sigma') = f(\sigma') + g(\sigma')$, $f, g \in \prod M_\sigma$, $\sigma' \in I$. Here, since each $M_\sigma$ is an
$R$-module, $\prod M_\sigma$ can be endowed with the structure of a right $R$-module, as follows.
If $f \in \prod M_\sigma$ and $r \in R$, we define $fr \in \prod M_\sigma$ to be the function defined by
$fr(\sigma) := f(\sigma)r$. The right $R$-module so defined is called the direct product over the
family $\{M_\sigma \mid \sigma \in I\}$ of $R$-modules.

The direct sum of the family $\{M_\sigma \mid \sigma \in I\}$ of $R$-modules is the submodule of
$\prod M_\sigma$ consisting of all functions $f \in \prod M_\sigma$ such that $f(\sigma) \ne 0_{M_\sigma}$ for only
finitely many $\sigma \in I$. (This is easily verified to be a submodule of $\prod M_\sigma$.)

The direct product $\prod M_\sigma$ comes equipped with a family of projection
homomorphisms $\pi_{\sigma'} : \prod M_\sigma \to M_{\sigma'}$, $\sigma' \in I$, defined by setting
$\pi_{\sigma'}(f) = f(\sigma') \in M_{\sigma'}$. Likewise the direct sum comes equipped with a family of
coordinate injections $\mu_{\sigma'} : M_{\sigma'} \to \prod M_\sigma$, defined by the rule
$\mu_{\sigma'}(m) := f_{\sigma'} \in \prod M_\sigma$, where

$$f_{\sigma'}(\sigma) = \begin{cases} m & \text{if } \sigma = \sigma' \\ 0 & \text{if } \sigma \ne \sigma'. \end{cases}$$

The direct sum of the $R$-modules $M_\sigma$, $\sigma \in I$, is denoted $\bigoplus_{\sigma \in I} M_\sigma$; again, note
that this is a submodule of the direct product $\prod M_\sigma$. These projection and injection
homomorphisms satisfy certain “universal conditions” which are expounded in Sect. 8.4.1.
We note that each $\mu_\sigma(M_\sigma) \cong M_\sigma$; furthermore it is routine to verify that

$$\bigoplus_{\sigma \in I} M_\sigma = \sum_{\sigma \in I} \mu_\sigma(M_\sigma) \subseteq \prod_{\sigma \in I} M_\sigma.$$

As was the case for abelian groups, when $I$ is a finite set, say, $I = \{1, \ldots, n\}$, the
notions of direct product and direct sum coincide, and we write

$$\prod_{i=1}^{n} M_i = \bigoplus_{i=1}^{n} M_i = M_1 \oplus M_2 \oplus \cdots \oplus M_n.$$


Let $M$ be a right $R$-module and assume that $\{M_\sigma \mid \sigma \in I\}$ is a family of
submodules with $M = \sum_{\sigma \in I} M_\sigma$. We have a natural surjective homomorphism
$p : \bigoplus M_\sigma \to \sum M_\sigma$ via

$$p(f) = \sum_{\sigma \in I} f(\sigma) \in \sum_{\sigma \in I} M_\sigma. \qquad (8.16)$$

Since $f \in \bigoplus M_\sigma$ implies that $f(\sigma) \ne 0$ for only finitely many $\sigma \in I$, we see that
the sum in Eq. (8.16) makes sense (being a finite sum). Again, the verification that
$p$ is a surjective $R$-module homomorphism is entirely routine. When $p$ is injective
(and hence is an isomorphism), we say that $M = \sum M_\sigma$ is an internal direct sum
of the submodules $M_\sigma$, $\sigma \in I$; when there is no danger of confusion, we shall write

$$M = \bigoplus_{\sigma \in I} M_\sigma.$$


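Theorem 8.1.6 (How To Recognize Direct Sums Internally) Let $M$ be a right
$R$-module and let $\{M_\sigma \mid \sigma \in I\}$ be a family of submodules. Then
$M = \bigoplus M_\sigma$ if and only if


(i) $M = \sum_{\sigma \in I} M_\sigma$, and
(ii) for each $\sigma \in I$, $M_\sigma \cap \sum_{\nu \ne \sigma} M_\nu = 0$.


Proof Assuming conditions (i) and (ii) above, it suffices to show that the
homomorphism $p : \bigoplus M_\sigma \to M = \sum M_\sigma$ defined above is injective. Thus, let
$f \in \bigoplus M_\sigma$, and assume that $p(f) = \sum_{\sigma \in I} f(\sigma) = 0$. But then, for any $\sigma \in I$,

$$f(\sigma) = -\sum_{\nu \ne \sigma} f(\nu) \in M_\sigma \cap \sum_{\nu \ne \sigma} M_\nu = 0;$$

since $\sigma \in I$ was arbitrary, we conclude that $f = 0$, proving that $p : \bigoplus M_\sigma \to M$ is
injective.

Conversely, assume that $p : \bigoplus M_\sigma \to M$ is an isomorphism. Then, as
$p(\bigoplus M_\sigma) = \sum M_\sigma$, we conclude already that $M = \sum M_\sigma$. Next, if
$0 \ne m \in M_\sigma \cap \sum_{\nu \ne \sigma} M_\nu$, we have elements $m_{\sigma'} \in M_{\sigma'}$, $\sigma' \in I$, almost all
of them zero, with $m_\sigma = m$ and $m_\sigma = -\sum_{\nu \ne \sigma} m_\nu$. But then, if we define
$f \in \bigoplus M_\sigma$ by $f(\sigma') := m_{\sigma'}$, $\sigma' \in I$, we conclude that $f$ is a nonzero element of
$\ker p$, a contradiction.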
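
The two conditions of Theorem 8.1.6 are easy to test mechanically for small modules. The sketch below (an illustration under assumed choices: the $\mathbb{Z}$-modules $\mathbb{Z}/30\mathbb{Z}$ and $\mathbb{Z}/12\mathbb{Z}$, and hypothetical helper names) checks conditions (i) and (ii) for a family of submodules and, independently, whether the map $p$ of Eq. (8.16) is injective.

```python
from itertools import product


def cyclic(d, n):
    """The submodule dZ/nZ of Z/nZ, for a divisor d of n."""
    return frozenset(range(0, n, d))


def sum_of(subs, n):
    """Sum of submodules: all sums of one element from each member."""
    total = {0}
    for s in subs:
        total = {(x + y) % n for x in total for y in s}
    return frozenset(total)


def is_internal_direct_sum(subs, n):
    """Conditions (i) and (ii) of Theorem 8.1.6 for a finite family in Z/nZ."""
    spans = sum_of(subs, n) == frozenset(range(n))
    independent = all(
        (s & sum_of(subs[:i] + subs[i + 1:], n)) == frozenset({0})
        for i, s in enumerate(subs)
    )
    return spans and independent


def p_is_injective(subs, n):
    """Is p(f) = sum of the coordinates of f injective on the external sum?"""
    images = [sum(f) % n for f in product(*subs)]
    return len(images) == len(set(images))


direct = [cyclic(15, 30), cyclic(10, 30), cyclic(6, 30)]   # orders 2, 3, 5
overlapping = [cyclic(2, 12), cyclic(3, 12)]               # meet is {0, 6}

print(is_internal_direct_sum(direct, 30), p_is_injective(direct, 30))
print(is_internal_direct_sum(overlapping, 12), p_is_injective(overlapping, 12))
```

In the second family the two submodules still span $\mathbb{Z}/12\mathbb{Z}$, so only condition (ii) fails, and correspondingly $p$ is surjective but not injective.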


Hypothesis (ii) of the preceding Theorem 8.1.6 can be relaxed when the collection
of potential summands is finite.


Corollary 8.1.7 (Recognizing finite internal direct sums) Suppose $\{M_1, \ldots, M_n\}$
is a collection of submodules of the right $R$-module $M$ and suppose they “span”
$M$, that is, $M = \sum_{i=1}^{n} M_i$. Then $M = M_1 \oplus \cdots \oplus M_n$ if and only if, for each $i$,
$1 \le i \le n - 1$, one has

$$M_{i+1} \cap \sum_{j=1}^{i} M_j = 0.$$


The proof is left as Exercise (4) in Sect. 8.5.1.


8.1.9 Free Modules


Let $M$ be a right $R$-module, and let $\mathcal{B}$ be a subset of $M$. An $R$-linear combination
of elements of $\mathcal{B}$ is any expression of the form $\sum_{b \in \mathcal{B}} br_b \in M$, where each
$r_b \in R$ and where only finitely many terms $br_b \ne 0$. We say that $\mathcal{B}$ spans $M$ if each
element of $M$ can be expressed as an $R$-linear combination of elements in $\mathcal{B}$. In this
case, it is clear that $M = \sum_{b \in \mathcal{B}} bR$.

The subset $\mathcal{B} \subseteq M$ is said to be $R$-linearly independent if a linear combination
$\sum br_b$ equals $0$ if and only if each “scalar” $r_b = 0$. Finally, we say that $\mathcal{B}$ is a basis
of $M$ if and only if $\mathcal{B}$ is $R$-linearly independent and spans $M$.


Lemma 8.1.8 The following statements about a right $R$-module $M$ are equivalent:


(i) $M$ has a basis.
(ii) There exists a subset $\mathcal{B}$ of $M$ such that each element of $M$ is uniquely expressible
as an $R$-linear combination of elements of $\mathcal{B}$.
(iii) $M$ is a direct sum of submodules isomorphic to $R_R$.


Proof (i) implies (ii). By (i) $M$ contains a basis $\mathcal{B}$, an $R$-linearly independent
spanning set. If $\sum_i b_ir_i$ and $\sum_i b_is_i$ were two $R$-linear combinations for the same
element $m \in M$, then their difference $\sum_i b_i(r_i - s_i)$ would be an $R$-linear combination
representing the module element $0$. By the $R$-linear independence of $\mathcal{B}$, one has
$r_i = s_i$ for all $i$, so the $R$-linear combinations for $m$ are the same. Thus (ii) holds.
(ii) implies (i). Since we can always write $0 = \sum_i b_i \cdot 0$, the uniqueness of the
$R$-linear combinations of elements of $\mathcal{B}$ representing $0$ establishes that $\mathcal{B}$ is
$R$-linearly independent. Since $\mathcal{B}$ is presented as a spanning set, it is in fact a basis.
(iii) implies (ii). Suppose $M$ is a direct sum of modules isomorphic to $R_R$. Then for
some set $B$, we can regard $M$ as the module of all functions $f : B \to R$ where
$f(b) \ne 0$ for only finitely many $b \in B$, under point-wise addition and right
multiplications by elements of $R$. For each such $f$ let $\mathrm{supp}(f) = \{b \in B \mid f(b) \ne 0\}$,
a finite set. Let $\mathcal{B}$ be the collection of functions $\{e_b \mid b \in B\}$ where $e_b(y) = 1_R$ if
$y = b$ and is $0_R$ otherwise (here $0_R$ and $1_R$ are the additive and multiplicative identity
elements of the ring $R$). Then for any function $f$ we may write

$$f = \sum_{b \in \mathrm{supp}(f)} e_b f(b).$$

In case $f = 0 \in M$, each $f(b) = 0$. It follows that every element of the direct sum
has a unique expression as an $R$-linear combination of the functions in $\mathcal{B}$. Thus (ii)
holds.

(i) implies (iii). Now suppose the $R$-module $M$ has a basis $\mathcal{B}$. Suppose $b \in \mathcal{B}$. Then
the uniqueness of expression means that for any $r, s \in R$, $br = bs$ implies $r = s$.
Thus the map which sends $br$ to $r$ is a well-defined isomorphism $bR \to R_R$ of right
$R$-modules. The directness of the sum $M = \bigoplus_{b \in \mathcal{B}} bR$ follows from the uniqueness
of expression. Thus (iii) holds.

The proof is complete.

We hasten to warn the reader that a given $R$-module need not have a basis (or
even a nontrivial linearly independent set). One need only consider the case that $M$ is
not a faithful module. At the other extreme, if $R$ is any ring, the corresponding right
$R$-module $M = R_R$ has the basis $\{1\}$, where $1$ is the multiplicative identity element
of $R$. (Note that, in fact, if $u \in R$ is any unit (= invertible element) of $R$, then $\{u\}$ is
a basis of $R_R$.)

A free (left) right $R$-module is simply an $R$-module having a basis, or equivalently,
either of the other two properties of Lemma 8.1.8.

The fundamental universal property of free modules is the following. 


Theorem 8.1.9 (The Universal Property of Free Modules) Let $F$ be a free $R$-module
with basis $\mathcal{B} \subseteq F$. Let $M$ be an arbitrary $R$-module, and let $m_b$, $b \in \mathcal{B}$, be arbitrary
elements of $M$. Then there exists a unique $R$-module homomorphism $\phi : F \to M$
such that $\phi(b) = m_b$, $b \in \mathcal{B}$.


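Proof The desired homomorphism $\phi : F \to M$ will be defined in stages. First set
$\phi(b) = m_b$, for all $b \in \mathcal{B}$. Next we set

$$\phi\Big(\sum_{b \in \mathcal{B}} br_b\Big) = \sum_{b \in \mathcal{B}} m_br_b \in M. \qquad (8.17)$$

Note that Eq. 8.17 produces a well-defined mapping $F \to M$, since any element of
$F$ has a unique representation as an $R$-linear combination of the basis elements in $\mathcal{B}$.

Next, it must be shown that this mapping obeys the fundamental properties of an
$R$-homomorphism. Consider elements $x, x' \in F$ and $r, r' \in R$. Since $\mathcal{B}$ is a basis
of $F$, there are unique representations

$$x = \sum_{b \in \mathcal{B}} br_b, \qquad x' = \sum_{b \in \mathcal{B}} br'_b$$

for suitable scalars $r_b, r'_b \in R$. Therefore,

$$\begin{aligned}
\phi(xr + x'r') &= \phi\Big(\sum_{b \in \mathcal{B}} br_br + \sum_{b \in \mathcal{B}} br'_br'\Big) \\
&= \phi\Big(\sum_{b \in \mathcal{B}} b(r_br + r'_br')\Big) \\
&= \sum_{b \in \mathcal{B}} m_b(r_br + r'_br') \\
&= \sum_{b \in \mathcal{B}} m_br_br + \sum_{b \in \mathcal{B}} m_br'_br' \\
&= \phi(x)r + \phi(x')r',
\end{aligned}$$

proving that $\phi$ is, indeed, a homomorphism of right $R$-modules. Uniqueness is clear,
since any $R$-homomorphism taking each $b$ to $m_b$ must satisfy Eq. 8.17. This proves the
result.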


The following result is very nearly obvious, but important nonetheless. 


Theorem 8.1.10 Let $M$ be an $R$-module. Then there exists a free $R$-module $F$ and
a surjective homomorphism $\epsilon : F \to M$.


Proof Form the direct sum $F = \bigoplus_{m \in M} R_m$, where each $R_m \cong R_R$. From the above,
$F$ is a free $R$-module having basis $\{e_m \mid m \in M\}$ where the elements $e_m$ are defined,
as usual, by requiring that

$$e_m(m') = \begin{cases} 1 & \text{if } m' = m \\ 0 & \text{if } m' \ne m. \end{cases}$$

Also, using Theorem 8.1.9, we see that there is a (unique) $R$-module homomorphism
$\epsilon : F \to M$ with $\epsilon(e_m) = m$, $m \in M$. Obviously, $\epsilon$ is surjective, completing the
proof.


8.1.10 Vector Spaces 


Let D be a division ring. A right D-module M is called a right vector space over 
D. Similarly, a left D-module is a left vector space over D. 

The student should be able to convert everything that we define for a right vector 
space to the analogous notion for a left vector space. There is thus no need to go 
through the discussion twice, once for left vector spaces and once again for right 
vector spaces—so we will stick to right vector spaces for this discussion. 

The elements of the vector space are called vectors. We say that a vector $v$ linearly
depends on a set $\{x_1, \ldots, x_n\}$ of vectors if there exist elements $\alpha_1, \ldots, \alpha_n \in D$,
such that

$$v = x_1\alpha_1 + \cdots + x_n\alpha_n.$$

Equivalently, $v$ linearly depends on $\{x_1, \ldots, x_n\}$ if and only if $v$ is contained in the
(right) submodule generated by $x_1, \ldots, x_n$. Clearly from this definition, the following
properties hold:

Reflexive Property. $x_i$ linearly depends on $\{x_1, \ldots, x_n\}$.

Transitivity. If $v$ linearly depends on $\{x_1, \ldots, x_n\}$ and each $x_i$ linearly depends
on $\{y_1, \ldots, y_m\}$, then $v$ linearly depends on $\{y_1, \ldots, y_m\}$.

The Exchange Condition. If vector $v$ linearly depends on $\{x_1, \ldots, x_n\}$ but not on
$\{x_1, \ldots, x_{n-1}\}$, then $x_n$ linearly depends on $\{x_1, \ldots, x_{n-1}, v\}$.


Thus linear dependence is a dependence relation as defined in Sect. 2.6.2. It
follows from Theorems 2.6.2 and 2.6.3 that maximal linearly independent sets exist and
span $V$. Such a set is then a basis of $V$, and so it follows immediately that a vector
space over a division ring $D$ is a free $D$-module. Next, by Theorem 2.6.4, any two
maximal linearly independent sets in the vector space $V$ have the same cardinality.
Therefore this cardinality is an invariant of $V$ and is called the dimension of $V$ over
$D$ and is denoted $\dim_D(V)$. Thus in particular, if $V$ is a finitely generated $D$-module,
then it is finite dimensional.

The above discussion motivates the following general question. If $M$ is a free
$R$-module over the arbitrary ring $R$, must it be true that any two bases of $M$ have the
same cardinality? The answer is no, in general, but is known to be true, for example, if
the right $R$-module $R_R$ is Noetherian. (See Sect. 8.2 for the definition of Noetherian.
For a proof in this case, see [1], An Introduction to Homological Algebra,
Academic Press, 1979, Theorem 4.9, p. 111 [33].) If the ring $R$ is commutative, then
we have the following affirmative result:


Theorem 8.1.11 If $M$ is a free module over the commutative ring $R$, then any two
bases of $M$ have the same cardinality.


Proof We may, using Zorn’s lemma, extract a maximal ideal $J \subseteq R$; thus $F := R/J$
is a field. Furthermore, one has that $M/MJ$ is an $R/J$-module, i.e., is an $F$-vector
space. Let $\mathcal{B} \subseteq M$ be an $R$-basis of $M$; as we have just seen that the dimension of a
vector space is well defined, it suffices to show that the set
$\bar{\mathcal{B}} := \{b + MJ \mid b \in \mathcal{B}\}$ is an $F$-basis of $M/MJ$. First, it is obvious that $\bar{\mathcal{B}}$
spans $M/MJ$. Next, an $F$-linear dependence relation $\sum_i (b_i + MJ)(r_i + J) = 0$
translates into a relation of the form $\sum_i b_ir_i \in MJ$. Therefore, there exist elements
$s_i \in J$ such that $\sum_i b_ir_i = \sum_i b_is_i$; but as the elements $b_i$ are $R$-linearly
independent, we infer that each $r_i = s_i$, i.e., that each $r_i \in J$. Therefore, each
$r_i + J = 0 \in F$, proving that $\bar{\mathcal{B}}$ is $F$-linearly independent.

This shows that the cardinality of any basis of the free module $M$ is equal to the
dimension of the $F$-vector space $M/MJ$.


As a result of Theorem 8.1.11, we see that we may unambiguously define the 
rank of a free module over a commutative ring to be the cardinality of any basis. 


8.2 Consequences of Chain Conditions on Modules 


8.2.1 Noetherian and Artinian Modules 


Let $M$ be a right $R$-module. We are interested in the poset $P(M) := (P, \le)$ of all
submodules of $M$, partially ordered by inclusion. We already mentioned in Sect. 8.1.7
that $P(M)$ is a semimodular lower semilattice for the express purpose of activating
the general Jordan-Hölder theory. Actually, the fundamental homomorphism theorems
of Sect. 8.1.6 imply a much stronger property:


Theorem 8.2.1 The poset $P(M) = (P, \le)$ of all submodules of a right $R$-module $M$ is
a modular lattice with a minimal element $0$ (the lattice “zero”) and a unique maximal
element, $M$ (the lattice “one”).


Recall from Sect. 2.5.4 that a lattice $L$ is modular if and only if

(M) $a \ge b$ implies $a \wedge (b \vee c) = b \vee (a \wedge c)$ for all $a, b, c$ in $L$.

This condition is manifestly equivalent to its dual:

(M*) $a \le b$ implies $a \vee (b \wedge c) = b \wedge (a \vee c)$ for all $a, b, c$ in $L$.

In the poset $P(M)$ the condition (M) reads:


If $A$, $B$, and $C$ are submodules with $B$ contained in $A$, then
$A \cap (B + C) = B + (A \cap C)$.


This is an easy argument, achieved by showing that an arbitrary element of one side
is an element of the set described by the other side.
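
For a finite module the modular law can also be confirmed exhaustively. The sketch below takes the $\mathbb{Z}$-module $\mathbb{Z}/4 \times \mathbb{Z}/2$ (an assumed example, as are all the names used), enumerates its submodules, and tests condition (M) on every triple with $B \subseteq A$. It also tests the stronger distributive identity $A \cap (B + C) = (A \cap B) + (A \cap C)$, which fails here, so the hypothesis $B \subseteq A$ cannot simply be dropped.

```python
from itertools import product

MODULI = (4, 2)  # illustrative choice: the Z-module Z/4 x Z/2
ZERO = tuple(0 for _ in MODULI)
ELEMENTS = list(product(*(range(m) for m in MODULI)))


def plus(x, y):
    return tuple((a + b) % m for a, b, m in zip(x, y, MODULI))


def span(gens):
    """Submodule generated by gens: close {0} under adding the generators."""
    sub = {ZERO}
    frontier = [ZERO]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = plus(x, g)
            if y not in sub:
                sub.add(y)
                frontier.append(y)
    return frozenset(sub)


def add(a, b):
    """The sum A + B of two submodules."""
    return frozenset(plus(x, y) for x in a for y in b)


# Every submodule of Z/4 x Z/2 needs at most two generators.
subs = {span((x, y)) for x in ELEMENTS for y in ELEMENTS}

modular = all(
    (a & add(b, c)) == add(b, a & c)
    for a, b, c in product(subs, repeat=3)
    if b <= a
)
distributive = all(
    (a & add(b, c)) == add(a & b, a & c)
    for a, b, c in product(subs, repeat=3)
)
print(len(subs), modular, distributive)  # 8 True False
```

The failure of distributivity is witnessed by the three submodules of order 2: any two of them sum to the same submodule of order 4, while any two of them meet in zero.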

We say that the right R-module M is right Noetherian if and only if the poset 
P(M) satisfies the ascending chain condition (ACC) (See Sect. 2.3.3). Recall from 
Lemma 2.3.4 that this is equivalent to saying that any nonempty collection C of 
submodules of M has a maximal member—that is, a module not properly contained 
in any other module in C. This formulation will be extremely useful in the sequel. 
Similarly, we say that a right R-module M is Artinian if and only if the poset P(M) 
satisfies the descending chain condition (DCC). By Lemma 2.3.5 this is equivalent 
to the assertion that any nonempty collection of submodules of M must contain a 
minimal member. 

Of course, there are left-module versions of these concepts, leading to
“left-Noetherian” and “left-Artinian” modules. When it is clear that one is speaking in
the context of (left) right $R$-modules, to say that a module is Noetherian is understood
to mean that it is (left) right Noetherian. Similarly, “Artinian” means “(left) right
Artinian” when speaking of (left) right $R$-modules.

Finally, the chain conditions enjoy certain hereditary properties: 


Lemma 8.2.2 (i) Suppose $N$ is a submodule of the right $R$-module $M$. Then $M$
is Noetherian (or Artinian) if and only if both $M/N$ and $N$ are Noetherian
(Artinian).

(ii) Suppose $M = \sum_{i=1}^{n} N_i$, a finite sum of its submodules. Then $M$ is Noetherian
(Artinian) if and only if each $N_i$ is Noetherian (Artinian).

(iii) Let $\{A_i \mid i = 1, \ldots, n\}$ be a finite family of submodules of $M$. Then
$M/(A_1 \cap \cdots \cap A_n)$ is Noetherian (Artinian) if and only if each factor module
$M/A_i$ is Noetherian (Artinian).


Proof These results are consequences of three facts:


1. The poset $P(M) = (P, \le)$ of all submodules of a right $R$-module $M$ is a modular
lattice with a minimal element $0$ (the lattice “zero”) and a unique maximal element,
$M$ (the lattice “one”) (Theorem 8.2.1).

2. Purely lattice-theoretic results concerning chain conditions in a modular lattice
given in Sect. 2.5.4, in particular, Lemmas 2.5.6 and 2.5.7.

3. The Correspondence Theorem for Modules (Theorem 8.1.3, part (i)) which
produces an isomorphism between the principal filter of $P(M)$ generated by $N$ and the
poset $P(M/N)$ of submodules of the factor module $M/N$.


In Exercises (2-4) in Sect. 8.5.2, respectively, the reader is asked to supply formal
proofs of the three parts of the Lemma using the facts listed above.


We say that the module $M$ has a finite composition series if and only if there is a
finite unrefinable chain of submodules in $P(M)$ proceeding from $0$ to $M$. Now since
$P(M)$ is a modular lattice, so is each of the intervals
$[A, B] := \{X \in P(M) \mid A \le X \le B\}$. Moreover, from the modular property, we have
that any interval $[A, A + B]$ is poset isomorphic to $[A \cap B, B]$. We then have:


Lemma 8.2.3 The following three conditions for an $R$-module $M$ are equivalent:


(i) $M$ has a finite composition series.
(ii) $M$ is both Noetherian and Artinian.
(iii) Every unrefinable chain of submodules is finite.


Proof Note first that (i) implies (iii) by Theorem 2.5.5. Next, as any chain of
submodules is contained in an unrefinable chain by Theorem 2.3.1, we see immediately
that (iii) implies (ii). Finally, assume condition (ii). Then as $M$ is Artinian, $M$ contains
a minimal nonzero submodule, say $M_1$. Again, as $M$ is Artinian, the set of submodules
of $M$ properly containing $M_1$ has a minimal member, say $M_2$. We continue in this
fashion to obtain an unrefinable chain $0 \subset M_1 \subset M_2 \subset \cdots$. Finally, since $M$ is
Noetherian, this unrefinable chain must stabilize, necessarily at $M$, which therefore
produces a finite composition series for $M$.


8.2.2 Effects of the Chain Conditions on Endomorphisms 


The Basic Results 


Let $f : M \to M$ be an endomorphism of the right $R$-module $M$. As usual, we regard
$f$ as operating from the left. Set $f^0$ to be the identity mapping on $M$, $f^1 := f$ and
inductively define $f^n := f \circ f^{n-1}$, for all positive integers $n$. Then $f^n$ is always
an $R$-endomorphism, and so its image $M_n := f^n(M)$ and its kernel $K_n := \ker(f^n)$
are submodules. We then have two series of submodules, one ascending and one
descending:

$$0 = K_0 \le K_1 \le K_2 \le \cdots$$
$$M = M_0 \ge M_1 \ge M_2 \ge \cdots$$


Lemma 8.2.4 Let $f$ be an endomorphism of the right $R$-module $M$ and let
$K_n := \ker(f^n)$ and $M_n := f^n(M)$ as above. The following statements hold:


(i) $M_n = M_{n+1}$ implies $M = M_n + K_n$.
(ii) $K_n = K_{n+1}$ implies $M_n \cap K_n = 0$.
(iii) If $M$ is Noetherian, then for all sufficiently large $n$, $M_n \cap K_n = 0$.
(iv) If $M$ is Artinian, then $M = M_n + K_n$ for all sufficiently large $n$.
(v) If $M$ is Noetherian and $f$ is surjective, then $f$ is an automorphism of $M$.
(vi) If $M$ is Artinian and $f$ is injective, then again $f$ is an automorphism of $M$.


Proof (i) Since $M_{n+1} = f(M_n)$, the hypothesis gives $M_n = M_{2n}$. Suppose $x$ is an
arbitrary element of $M$. Then $f^n(x) = f^{2n}(y)$, for some element $y$. But we can write
$x = f^n(y) + (x - f^n(y))$. Now the latter summand is an element of $K_n$ while the former
summand belongs to $M_n$. Thus $x \in M_n + K_n$, proving (i).

For (ii), suppose $K_n = K_{n+1}$, so that $K_n = K_{2n}$. Choose $x \in K_n \cap M_n$. Then
$x = f^n(y)$ for some element $y$, and so

$$0 = f^n(x) = f^{2n}(y).$$

But as $K_{2n} = K_n$, $f^{2n}(y) = 0$ implies $f^n(y) = 0 = x$. Thus $K_n \cap M_n = 0$.

(iii) If $M$ is Noetherian, the ascending series $K_0 \le K_1 \le \cdots$ must stabilize at
some point, so $K_n = K_{n+1}$ for all sufficiently large $n$. By part (ii),
$M_n \cap K_n = 0$ for all such $n$, and so (iii) holds.

(iv) If $M$ is Artinian, the descending series $M_0 \ge M_1 \ge M_2 \ge \cdots$ eventually
stabilizes and so we have $M_n = M_{n+1}$ for sufficiently large $n$. Now apply part (i).

(v) Note that if $f$ is surjective, $M = M_0 = M_n$ for all $n$. But $M$ is Noetherian, so
by part (iii), $M_n \cap K_n = 0$ for some $n$. Thus $K_n = 0$. But since the $K_i$ are ascending,
$K_1 \le K_n = 0$, which makes $f$ injective. Hence $f$ is a bijective endomorphism, that
is, an automorphism, whence (v).

(vi) Finally, if $f$ is injective, $0 = K_n$ for all $n$; since $M$ is Artinian, $M_n = M_{n+1}$
for sufficiently large $n$. Apply part (iv) to conclude that $M = M_n + K_n$, i.e., that
$M = M_n$. Since $M_1 \ge M_n$, we have, a fortiori, that $M = M_1$, proving that $f$ is
surjective. Hence $f$ is a bijection once more, proving (vi).


Corollary 8.2.5 Suppose $M$ is both Artinian and Noetherian and $f : M \to M$ is
an endomorphism. Then for some natural number $n$,


\[ M = M_n \oplus K_n. \]
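
As a concrete illustration of Lemma 8.2.4 and Corollary 8.2.5, the following worked
example computes both series for multiplication by 2 on $\mathbb{Z}/6\mathbb{Z}$ and
contrasts it with $\mathbb{Z}$, where the Artinian hypothesis fails. The choice of
modules and of the endomorphism is ours, made only for illustration; the fragment
assumes the amsmath and amssymb packages.

```latex
% Assumed example: M = Z/6Z as a Z-module, f(x) = 2x.
\begin{align*}
M_n &= f^n(M) = 2^n M = \{0, 2, 4\} && (n \ge 1),\\
K_n &= \ker(f^n) = \{x : 2^n x \equiv 0 \pmod 6\} = \{0, 3\} && (n \ge 1),\\
M_1 \cap K_1 &= 0, \qquad M_1 + K_1 = M, && \text{so } M = M_1 \oplus K_1 .
\end{align*}
% Contrast: M = Z, f(x) = 2x. Z is Noetherian but not Artinian.
\begin{align*}
K_n &= 0, \qquad M_n = 2^n\mathbb{Z}, \qquad M_0 > M_1 > M_2 > \cdots,\\
M_n + K_n &= 2^n\mathbb{Z} \ne \mathbb{Z} \qquad (n \ge 1).
\end{align*}
```

On $\mathbb{Z}$ the map $f$ is injective but not surjective, so parts (iv) and (vi) of
the lemma really do need the Artinian hypothesis.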

Schur’s and Fitting’s Lemmas


Schur’s Lemma and Fitting’s Lemma both address the structure of the endomorphism
ring $\mathrm{End}_R(M)$ of an $R$-module $M$. First of all, recall from Sect. 8.1.7 that an
$R$-module $M \ne 0$ is said to be irreducible if $M$ has no nontrivial proper
$R$-submodules.

Lemma 8.2.6 (Schur’s Lemma) If $M$ is an irreducible $R$-module, then $\mathrm{End}_R(M)$
is a division ring.


Proof If $0 \ne \phi \in \mathrm{End}_R(M)$, then $\ker\phi$ and $\mathrm{im}\,\phi$ are both submodules of $M$.
Since $M$ is irreducible and $\phi \ne 0$, we must have $\ker\phi = 0$ and $\mathrm{im}\,\phi = M$. We
conclude that $\phi : M \to M$ is bijective, and hence possesses an inverse, i.e.,
$\mathrm{End}_R(M)$ is a division ring.


Next, a module M is said to be indecomposable if and only if it is not the direct 
sum of two non-trivial submodules. Fitting’s Lemma generalizes Schur’s lemma, as 
below: 


Theorem 8.2.7 (Fitting’s Lemma) Suppose $M \ne 0$ is an indecomposable right
$R$-module having a finite composition series. Then the following hold:


(i) For any $R$-endomorphism $f : M \to M$, $f$ is either invertible (that is, $f$ is a
unit in the endomorphism ring $S := \mathrm{End}_R(M)$) or $f$ is nilpotent (which means
that $f^n = 0$, the trivial endomorphism, for some natural number $n$).

(ii) The endomorphism ring $S = \mathrm{End}_R(M)$ possesses a unique maximal right ideal
which is also the unique maximal left ideal, and which consists of all nilpotent
elements of $S$.


Proof (i) First of all, by Lemma 8.2.3, $M$ has a finite composition series if and only if
it is both Artinian and Noetherian. It follows from Corollary 8.2.5 that for sufficiently
large $n$, $M = M_n \oplus K_n$. Since $M$ is indecomposable, either $M_n = 0$ or $K_n = 0$. In
the former case $f$ is nilpotent. In the latter case $f$ is injective, and it is also surjective
since $M = M_n$. Thus $f$ is an automorphism of $M$ and so possesses a two-sided
inverse in $S$.

(ii) It remains to show that the nilpotent elements of S form a two-sided ideal, 
and to prove the uniqueness assertions about this ideal. 

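Suppose $f$ is nilpotent. Since $M \ne 0$, $K_1 = \ker f \ne 0$ and $f(M) \ne M$. So for any
endomorphism $s \in S$, the product $sf$ is not injective (its kernel contains $\ker f$) and
the product $fs$ is not surjective (its image lies in $f(M)$). Hence neither is invertible
and, by the dichotomy imposed by Part (i), both $sf$ and $fs$ must be nilpotent.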

Next, assume that $f$ and $g$ are nilpotent endomorphisms while $h := f + g$ is not.
Then by the dichotomy, $h$ possesses a 2-sided inverse $h^{-1}$. Then


\[ \mathrm{id}_M = hh^{-1} = fh^{-1} + gh^{-1}, \]


where $\mathrm{id}_M$ denotes the identity mapping $M \to M$. Set $s := fh^{-1}$ and $t := gh^{-1}$.
Since $s = \mathrm{id}_M - t$, the two endomorphisms $s$ and $t$ commute and, by the previous
paragraph, are nilpotent. Then there is a natural number $m$ large enough to force
$s^m = t^m = 0$. Then


\[ \mathrm{id}_M = (\mathrm{id}_M)^{2m} = (s + t)^{2m} = \sum_{j=0}^{2m} \binom{2m}{j} s^{2m-j} t^{j}. \]


Each monomial $s^{2m-j}t^{j} = 0$ since at least one of the two exponents exceeds $m - 1$.
Therefore, $\mathrm{id}_M = 0$, an absurdity as $M \ne 0$. Thus $h$ must also be nilpotent, and
so the set $\mathrm{nil}(S)$ of all nilpotent elements of $S$ is closed under addition. From the
conclusion of the previous paragraph this set of nilpotent elements $\mathrm{nil}(S)$ forms a
two-sided ideal.

That this ideal is the unique maximal member of the poset of right ideals of $S$
follows from the fact that $S - \mathrm{nil}(S)$ consists entirely of units. A similar statement
holds for the poset of left ideals of $S$. The proof is complete.
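
To see Schur’s and Fitting’s Lemmas side by side, here is a small worked example.
The modules $\mathbb{Z}/p\mathbb{Z}$ and $\mathbb{Z}/4\mathbb{Z}$ over $R = \mathbb{Z}$ are our own
choice for illustration and do not come from the text; the fragment assumes the
amsmath and amssymb packages.

```latex
% Assumed example: endomorphism rings of two small Z-modules.
% Every endomorphism of Z/nZ has the form x -> ax, so End_Z(Z/nZ) is
% isomorphic to the ring Z/nZ.
\begin{align*}
M = \mathbb{Z}/p\mathbb{Z}\ (p \text{ prime}) &: \quad M \text{ is irreducible and }
  \mathrm{End}_{\mathbb{Z}}(M) \cong \mathbb{Z}/p\mathbb{Z} \text{ is a field.}\\
M = \mathbb{Z}/4\mathbb{Z} &: \quad 0 < 2M < M \text{ is a composition series and }
  M \text{ is indecomposable.}\\
S = \mathrm{End}_{\mathbb{Z}}(M) \cong \mathbb{Z}/4\mathbb{Z} &: \quad
  \text{units } \{1, 3\}, \qquad \mathrm{nil}(S) = \{0, 2\} \quad (2^2 = 0).
\end{align*}
```

Every endomorphism of $\mathbb{Z}/4\mathbb{Z}$ is thus either invertible or nilpotent, and
$\{0, 2\}$ is the unique maximal ideal, as Fitting’s Lemma predicts. Since
$\mathbb{Z}/4\mathbb{Z}$ is not irreducible, its endomorphism ring is not a division ring.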


The Krull-Remak-Schmidt Theorem 


The Krull-Remak-Schmidt Theorem asserts that any right $R$-module that is both
Artinian and Noetherian can be expressed as a direct sum of indecomposable
submodules, and that this decomposition is unique up to the isomorphism type and
order of the summands. We begin with a lemma that will be used in the uniqueness
aspect.


Lemma 8.2.8 Let $f : A = A_1 \oplus A_2 \to B = B_1 \oplus B_2$ be an isomorphism of
Artinian modules, and define homomorphisms $\alpha : A_1 \to B_1$ and $\beta : A_1 \to B_2$ by
the equation


\[ f(a_1, 0) = (\alpha(a_1), \beta(a_1)) \]

for all $a_1 \in A_1$. If $\alpha$ is an isomorphism, then there exists an isomorphism
$g : A_2 \to B_2$.


Proof Clearly $\alpha = \pi_1 \circ f \circ \mu_1$ and $\beta = \pi_2 \circ f \circ \mu_1$, where $\pi_j$ is the canonical
coordinate projection $B \to B_j$, $j = 1, 2$, and $\mu_1$ is the canonical embedding
$A_1 \to A_1 \oplus A_2$ sending $a_1$ to $(a_1, 0)$.

Suppose first that $\beta = 0$. In this case, the composition $\pi_2 \circ f : A \to B_2$ is
surjective and has kernel precisely $A_1 \oplus 0 \le A$ and hence induces an isomorphism
$\overline{\pi_2 \circ f} : A_2 \cong A/(A_1 \oplus 0) \to B_2$.

So our proof will be complete if we can construct from $f$ another isomorphism
$h : A \to B$ such that $h(a_1, 0) = (\alpha(a_1), 0)$ for all $a_1 \in A_1$. We define $h$ in the
following way: if $f(a_1, a_2) = (b_1, b_2)$, set


\[ h(a_1, a_2) = (b_1,\ b_2 - \beta\alpha^{-1}(b_1)). \]


Clearly $h$ is an $R$-homomorphism, and is easily checked to be injective. Now
$h \circ f^{-1} : B \to B$ is an endomorphism of an Artinian module. Since $h$ is injective, so
is $h \circ f^{-1}$. Then by Lemma 8.2.4, part (vi), $h \circ f^{-1}$ is surjective, which clearly
implies that $h$ is surjective. Since $h(a_1, 0) = (\alpha(a_1), 0)$, the proof is complete.


The next lemma addresses the existence aspect of the Krull-Remak-Schmidt
Theorem:


Lemma 8.2.9 Suppose $A$ is an Artinian right $R$-module. Then $A$ is a direct sum of
finitely many indecomposable modules.

Proof Let $\mathcal{C}$ be the family of submodules $C \subseteq A$ for which the statement of the
lemma is false. We shall assume, by way of contradiction, that $\mathcal{C} \ne \emptyset$. Since $A$ is
Artinian, $\mathcal{C}$ has a minimal member, say $C \in \mathcal{C}$. Clearly, $C$ is not itself
indecomposable, and so must itself decompose: $C = C' \oplus C''$, for proper nontrivial
submodules $0 < C', C'' < C$. By minimality of $C \in \mathcal{C}$, we infer that both $C'$ and
$C''$ must decompose into direct sums of finitely many indecomposable submodules.
But then so does $C = C' \oplus C''$, a contradiction. Thus, $\mathcal{C} = \emptyset$ and the proof is
complete.


The above preparations now lead to: 


Theorem 8.2.10 (The Krull-Remak-Schmidt Theorem) If $A$ is an Artinian and
Noetherian right $R$-module, then $A$ decomposes into a direct sum of finitely many
indecomposable right $R$-modules. This decomposition is unique up to the
isomorphism type and order of the summands.¹


Proof By Lemma 8.2.9 we know that $A$ admits such a decomposition. As for
the uniqueness, assume that $A$, $A'$ are isomorphic right $R$-modules admitting
decompositions


\[ A = A_1 \oplus A_2 \oplus \cdots \oplus A_m, \qquad A' = A'_1 \oplus A'_2 \oplus \cdots \oplus A'_n, \]


where the submodules $A_i \subseteq A$, $A'_j \subseteq A'$ are indecomposable submodules. We may
assume that $m \le n$ and shall argue by induction on $m$ that $m = n$ and that (possibly
after rearrangement) $A_i \cong_R A'_i$, $i = 1, 2, \ldots, m$.

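Assume that $f : A \to A'$ is the hypothesized isomorphism, and denote by
$\pi_i : A \to A_i$, $\pi'_j : A' \to A'_j$, $\mu_i : A_i \to A$, $\mu'_j : A'_j \to A'$ the canonical
projection and coordinate injection homomorphisms. We now define the $R$-module
homomorphisms


\[ \alpha_i := \pi'_1 \circ f \circ \mu_i : A_i \to A'_1, \]
\[ \alpha'_i := \pi_i \circ f^{-1} \circ \mu'_1 : A'_1 \to A_i, \]


$i = 1, 2, \ldots, m$. One then checks that


\[ \sum_{i=1}^{m} \alpha_i \circ \alpha'_i \]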


is the identity mapping on the indecomposable module $A'_1$. By Fitting’s Lemma
(Theorem 8.2.7) not all the $\alpha_i \circ \alpha'_i$ can be nilpotent. Reindexing if necessary, we
may assume $\alpha_1 \circ \alpha'_1$ is not nilpotent. Then it is in fact an automorphism of $A'_1$, and
so $\alpha_1 : A_1 \to A'_1$ is an isomorphism. Furthermore, we have


\[ f(a_1, 0, \ldots, 0) = (\alpha_1(a_1), \ldots) \in A'. \]

Applying Lemma 8.2.8, there is an isomorphism

\[ A_2 \oplus \cdots \oplus A_m \cong A'_2 \oplus \cdots \oplus A'_n. \]

Now we apply the induction hypothesis to deduce that $m = n$, and that for some
permutation $\sigma$ of $J = \{2, \ldots, m\}$, $A_i$ is isomorphic to $A'_{\sigma(i)}$, $i \in J$. The result
follows.


¹ Here is a precise rendering of the uniqueness statement: Let $A = A_1 \oplus \cdots \oplus A_m$ be
isomorphic to $B = B_1 \oplus \cdots \oplus B_n$ where the $A_i$ and $B_j$ are indecomposable. Then
$m = n$ and there is a permutation $\rho$ of the index set $J = \{1, \ldots, n\}$ such that $A_i$ is
isomorphic to $B_{\rho(i)}$ for all $i \in J$.
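
The kind of uniqueness asserted by the Krull-Remak-Schmidt Theorem can be seen in
two small examples over $R = \mathbb{Z}$. They are our own illustrations, not taken from
the text; finite modules are automatically both Artinian and Noetherian. The fragment
assumes the amsmath and amssymb packages.

```latex
% Assumed examples: decompositions of two finite abelian groups.
\begin{align*}
\mathbb{Z}/12\mathbb{Z} &= \{0, 3, 6, 9\} \oplus \{0, 4, 8\}
   \cong \mathbb{Z}/4\mathbb{Z} \oplus \mathbb{Z}/3\mathbb{Z},\\
(\mathbb{Z}/2\mathbb{Z})^2 &= \langle (1,0) \rangle \oplus \langle (0,1) \rangle
   = \langle (1,1) \rangle \oplus \langle (0,1) \rangle
   = \langle (1,0) \rangle \oplus \langle (1,1) \rangle .
\end{align*}
```

In the second example the indecomposable summands differ as submodules from one
decomposition to the next, yet each is isomorphic to $\mathbb{Z}/2\mathbb{Z}$ and each
decomposition has two of them. This is why the theorem speaks of uniqueness only up
to isomorphism type and order.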


8.2.3 Noetherian Modules and Finite Generation 


We can come to the point at once: 


Theorem 8.2.11 Suppose M is a right R-module, for some ring R. Then M is right 
Noetherian if and only if every right submodule of M is finitely generated. 


Proof Assume that $M$ is a Noetherian right $R$-module, and let $N \subseteq M$ be a
submodule. Let $\mathcal{N}$ be the family of finitely-generated submodules of $N$; since $M$ is
Noetherian we know that $\mathcal{N}$ must have a maximal member $N_0$. If $N_0 \subsetneq N$, then
there exists an element $n \in N \setminus N_0$. But then $N_0 + nR$ is clearly finitely generated;
since $N_0 \subsetneq N_0 + nR \subseteq N$, we have an immediate contradiction. Therefore, $N$ must
be finitely generated.

Conversely, assume that every submodule of $M$ is finitely generated, and let
$M_1 \subseteq M_2 \subseteq \cdots$ be a chain of submodules of $M$. Then it is clear that
$M_\infty := \bigcup_{i=1}^{\infty} M_i$ is a submodule of $M$, hence must itself be finitely generated, say
$M_\infty = x_1R + x_2R + \cdots + x_nR$. But since $x_1, \ldots, x_n \in \bigcup M_i$, we infer that there
must exist some integer $m$ with $x_1, \ldots, x_n \in M_m$, which guarantees that
$M_m = M_{m+1} = \cdots$. Since the ascending chain $M_1 \subseteq M_2 \subseteq \cdots$ was arbitrary, $M$ is
by definition Noetherian.


8.2.4 E. Noether’s Theorem 


The reader shall soon discover that many properties of a ring are properties which
are lifted (in name) from the $R$-modules $R_R$ or ${}_RR$. We say that a ring $R$ is a
right Noetherian ring if and only if $R_R$ is a Noetherian right $R$-module. In view of
Theorem 8.2.11 we are saying that a ring $R$ is right Noetherian if and only if every
right ideal is finitely generated. In an entirely analogous fashion, we may define the
concept of a left Noetherian ring.

Note that there is a huge difference between saying, on the one hand, that a module
is finitely generated, and on the other hand, that every submodule is finitely generated.
Indeed if $R$ is a ring, then the right $R$-module $R_R$ is certainly finitely generated (viz.,
by the identity $1 \in R$). However, if $R_R$ is not right Noetherian, then there will exist
right ideals of $R$ that are not finitely generated.
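
A standard example makes this difference concrete. The ring below is our own choice
for illustration and is not discussed in the text; the fragment assumes the amsmath
and amssymb packages.

```latex
% Assumed example: R = k[x_1, x_2, ...], polynomials in countably many
% commuting indeterminates over a field k.
\begin{align*}
R_R &= 1 \cdot R && \text{(generated by the single element } 1\text{)},\\
I &= x_1 R + x_2 R + x_3 R + \cdots && \text{(not finitely generated)},\\
x_1 R &\subsetneq x_1 R + x_2 R \subsetneq x_1 R + x_2 R + x_3 R \subsetneq \cdots
   && \text{(a chain that never stabilizes)}.
\end{align*}
```

To see that $I$ is not finitely generated, note that any finite subset of $I$ involves only
finitely many indeterminates, say $x_1, \ldots, x_N$. The ring homomorphism $R \to R$ that
sends $x_1, \ldots, x_N$ to $0$ and fixes the remaining indeterminates kills the ideal
generated by that subset, but it does not kill $x_{N+1} \in I$.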

However, if the ring R is right Noetherian, then this difference evaporates: this is 
the content of Noether’s Theorem. 


Theorem 8.2.12 (E. Noether’s Theorem) Let R be a right Noetherian ring, and 
let M be a right R-module. Then M is finitely generated if and only if M is right 
Noetherian. 


Proof Obviously, if $M$ is Noetherian, it is finitely generated. Conversely, we may
write $M = x_1R + x_2R + \cdots + x_nR$ and each submodule $x_iR$, being a homomorphic
image of $R_R$, is a Noetherian module. Therefore, $M$ is Noetherian by virtue of
Exercise (2) in Sect. 8.5.2.


Although the above theorem reveals a nice consequence of Noetherian rings,
the applicability of the theorem appears limited until we know that such rings are
rather abundant. One very important class of Noetherian rings are the principal ideal
domains. These are the integral domains such that every ideal is generated by a single
element, and in Chap. 10, we shall have quite a bit more to say about modules over
such rings. The basic prototype here is the ring $\mathbb{Z}$ of integers. That this is a principal
ideal domain can be shown very easily, as follows. Let $0 \ne I \subseteq \mathbb{Z}$ be an ideal and
let $d$ be the least positive integer contained in $I$. If $x \in I$, then by the “division
algorithm” (Lemma 1.1.1) there exist integers $q$ and $r$ with $x = qd + r$ and where
$0 \le r < d$. Since $r = x - qd \in I$, we conclude that by the minimality of $d$, $r = 0$,
i.e., $x$ must be a multiple of $d$. Therefore,


Lemma 8.2.13 The ring of integers Z is a principal ideal domain. 
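
The argument for Lemma 8.2.13 can be traced on a specific ideal. The ideal
$I = 12\mathbb{Z} + 18\mathbb{Z}$ below is our own example, chosen only for illustration; the
fragment assumes the amsmath and amssymb packages.

```latex
% Assumed example: the ideal I = 12Z + 18Z of Z.
\begin{align*}
6 &= 18 - 12 \in I, \qquad 12a + 18b = 6(2a + 3b) \ \text{ for all } a, b \in \mathbb{Z},\\
d &= 6 \ \text{ is the least positive element of } I, \qquad I = 6\mathbb{Z},\\
x &= 30 = 12 + 18 \in I: \qquad 30 = 5 \cdot 6 + 0, \qquad r = 0 .
\end{align*}
```

The first line shows that every element of $I$ is a multiple of 6 and that 6 itself lies
in $I$, so no smaller positive integer can occur. The last line is one instance of the
division step with remainder $r = 0$.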


Note that since an integral domain is by definition commutative, it does not matter 
whether we are speaking of right ideals or left ideals here. 

This argument used only certain properties shared by all Euclidean Domains
(see Sect. 9.3). All of them are principal ideal domains by a proof almost identical
to the above. But principal ideal domains are a larger class of rings, and most of
the important theorems (about their modules and the unique factorization of their
elements into primes) encompass basically the analysis of finite-dimensional linear
transformations and the elementary arithmetic of the integers. So this subject shall
be taken up in much greater detail in a separate chapter (see Chap. 10). Nonetheless,
as we shall see in the next subsection, just knowing that $\mathbb{Z}$ is a Noetherian ring yields
the important fact that “algebraic integers” form a ring.


8.2.5 Algebraic Integers and Integral Elements 


Let $D$ be a Noetherian integral domain and let $K$ be a field containing $D$. An element
$\alpha \in K$ is said to be integral over $D$ if and only if $\alpha$ is the zero of a polynomial in
$D[x]$ having lead coefficient 1. In this case, of course, some power of $\alpha$ is a $D$-linear
combination of lower powers, say


\[ \alpha^n = a_{n-1}\alpha^{n-1} + \cdots + a_2\alpha^2 + a_1\alpha + a_0, \]


where $n$ is a positive integer and the coefficients $a_i$ lie in $D$.

In the particular case that $D = \mathbb{Z}$ and $K$ is a subfield of the complex numbers,
then any element $\alpha \in K$ that is integral over $\mathbb{Z}$ is called an algebraic integer. Thus
$\sqrt[3]{3}$ is an algebraic integer (as it is a zero of the polynomial $x^3 - 3 \in \mathbb{Z}[x]$), but one
can show, however, that $\tfrac{1}{2}\sqrt{3}$ is not. The complex number $\omega = (-1 + \sqrt{-3})/2$ is an
algebraic integer because $\omega^2 + \omega + 1 = 0$.
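
Both kinds of claim can be checked by hand. The computation below takes $\omega$ to be
the primitive cube root of unity $(-1 + \sqrt{-3})/2$ and uses $\beta = \tfrac{1}{2}\sqrt{3}$ as
a non-example; the symbol $\beta$ and the layout are ours, and the fragment assumes the
amsmath package.

```latex
% Assumed worked check with \omega = (-1 + \sqrt{-3})/2.
\begin{align*}
\omega^2 &= \frac{1 - 2\sqrt{-3} + (-3)}{4} = \frac{-1 - \sqrt{-3}}{2},\\
\omega^2 + \omega + 1 &= \frac{-1 - \sqrt{-3}}{2} + \frac{-1 + \sqrt{-3}}{2} + 1 = -1 + 1 = 0 .
\end{align*}
% A non-example: \beta = \sqrt{3}/2 satisfies a non-monic integer equation.
\begin{align*}
\beta^2 &= \tfrac{3}{4}, \qquad 4\beta^2 - 3 = 0 .
\end{align*}
```

If $\beta$ were an algebraic integer, then so would be $\beta^2 = 3/4$, since the algebraic
integers are closed under multiplication (Theorem 8.2.15). But a rational number that
is a zero of a monic polynomial in $\mathbb{Z}[x]$ must be an integer, by the rational root test,
and $3/4$ is not.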

There is a simple criterion for an element of $K$ to be integral over $D$.


Lemma 8.2.14 Let $K$ be a field containing the Noetherian integral domain $D$. Then
an element $\alpha$ of $K$ is integral over $D$ if and only if the subring $D[\alpha]$ is a finitely
generated $D$-module.


Proof First suppose $\alpha$ is integral over $D$. We must show that $D[\alpha]$ is a finitely
generated $D$-module. By definition, there exists a positive integer $n$ such that $\alpha^n$ is a
$D$-linear combination of the elements of $X := \{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$. Let $M = \langle X \rangle_D$
be the submodule of $D[\alpha]$ generated by $X$. Now $\alpha X$ contains $\alpha^n$ together with lower
powers of $\alpha$, all of which are $D$-linear combinations of elements of $X$. Put more
succinctly, $M\alpha \subseteq M$. It now follows that all positive integer powers of $\alpha$ lie in $M$,
and so $M = D[\alpha]$. Since $M$ is generated by the finite set $X$, $D[\alpha]$ is indeed finitely
generated.

Now suppose $D[\alpha]$ is a finitely generated $D$-module. We must show that $\alpha$ is
integral over $D$, that is, $\alpha$ is a zero of a monic polynomial in $D[x]$. We now apply
Noether’s Theorem (Theorem 8.2.12) to infer that, in fact, $D[\alpha]$ is a Noetherian
$D$-module. But it contains the following chain of submodules:


\[ D \le D + \alpha D \le D + \alpha D + \alpha^2 D \le \cdots \]

and the Noetherian condition forces this chain to terminate, say at

\[ D + \alpha D + \cdots + \alpha^{n-1} D. \tag{8.18} \]

Then $\alpha^n D$ must lie in the module presented in (8.18) and so the element $\alpha^n$ is
a $D$-linear combination of lower powers of $\alpha$. Of course that makes $\alpha$ integral
over $D$.
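
The criterion of Lemma 8.2.14 can be seen at work on one integral and one
non-integral element over $D = \mathbb{Z}$. Both examples are our own illustrations; here
$\omega$ denotes a zero of $x^2 + x + 1$, and the fragment assumes the amsmath and
amssymb packages.

```latex
% Assumed examples over D = Z.
% Integral case: \omega^2 = -\omega - 1, so the two generators 1, \omega suffice.
\[ \mathbb{Z}[\omega] = \mathbb{Z} + \mathbb{Z}\omega, \qquad \omega^2 + \omega + 1 = 0 . \]
% Non-integral case: \alpha = 1/2, where
% Z + \alpha Z + ... + \alpha^k Z = 2^{-k} Z for every k.
\[ \mathbb{Z} \subsetneq \tfrac{1}{2}\mathbb{Z} \subsetneq \tfrac{1}{4}\mathbb{Z} \subsetneq \cdots, \qquad
   \mathbb{Z}[\tfrac{1}{2}] = \bigcup_{k \ge 0} 2^{-k}\mathbb{Z} . \]
```

In the second case the chain of submodules never terminates, so $\mathbb{Z}[\tfrac{1}{2}]$ is not a
finitely generated $\mathbb{Z}$-module. This matches the fact that $\tfrac{1}{2}$ is a zero of $2x - 1$
but of no monic polynomial in $\mathbb{Z}[x]$.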


The main thrust of this subsection is the following: 


Theorem 8.2.15 Let $K$ be a field containing the Noetherian integral domain $D$ and
let $\mathcal{O}$ denote the collection of all elements of $K$ that are integral over $D$. Then $\mathcal{O}$ is
a subring of $K$.

Proof Using the identity $(-\alpha)^j = (-1)^j\alpha^j$, it is easy to see that if $\alpha$ is the zero of
a monic polynomial in $D[x]$, then so is $-\alpha$. Thus $-\mathcal{O} = \mathcal{O}$.

It remains to show that the set $\mathcal{O}$ is closed under the addition and multiplication
operations of $K$. Let $\alpha$ and $\beta$ be any elements of $\mathcal{O}$. Then there exist positive integers
$n$ and $m$ such that the subring $D[\alpha]$ is generated (as a $D$-module) by powers $\alpha^i$, $i < n$,
and similarly the $D$-module $D[\beta]$ is generated by powers $\beta^j$, $j < m$. Thus every
monomial expression $\alpha^k\beta^\ell$ in the subring $D[\alpha, \beta]$ is a $D$-linear combination of the
finite set of monomial elements $\{\alpha^i\beta^j \mid i < n,\ j < m\}$. It follows that the $D$-module
$D[\alpha, \beta]$ is finitely generated. Now since $D$ is a Noetherian ring, Theorem 8.2.12
implies that $D[\alpha, \beta]$ is Noetherian.

Now $D[\alpha + \beta]$ and $D[\alpha\beta]$ are submodules of the Noetherian module $D[\alpha, \beta]$ and
so are themselves Noetherian by Theorem 8.2.11. But then, invoking once again the
fact that $D$ is a Noetherian ring, we see that $D[\alpha + \beta]$ and $D[\alpha\beta]$ are both finitely
generated $D$-modules, and so $\alpha + \beta$ and $\alpha\beta$ are in $\mathcal{O}$ by Lemma 8.2.14. Thus $\mathcal{O}$ is
indeed a subset of the field $K$ which is closed under addition and multiplication in $K$.


Since the algebraic integers are simply the elements of the complex number field C 
which are integral over the subring Z of integers, we immediately have the following: 


Corollary 8.2.16 The algebraic integers form a ring. 


In the opinion of the authors, the spirit of abstract algebra is immortalized in this 
maxim: 


• a good definition is better than many lines of proof.


Perhaps nothing illustrates this principle better than the proof of Theorem 8.2.15 just
given. In the straightforward way of proving this theorem one is obligated to produce
monic polynomials with coefficients in $D$ for which $\alpha + \beta$ and $\alpha\beta$ are roots; not an
easy task. Yet the above proof was not hard. That was because we had exactly the
right definitions!
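
To get a feeling for what the direct approach involves, here is the computation in
one small case. The choice $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ over $D = \mathbb{Z}$ is our own
illustration; the fragment assumes the amsmath and amssymb packages.

```latex
% Assumed example: \alpha = \sqrt{2}, \beta = \sqrt{3}, D = \mathbb{Z}.
\begin{align*}
\mathbb{Z}[\alpha, \beta] &= \mathbb{Z} + \mathbb{Z}\sqrt{2} + \mathbb{Z}\sqrt{3} + \mathbb{Z}\sqrt{6}
   \qquad (\text{spanned by } \alpha^i\beta^j,\ i < 2,\ j < 2),\\
(\alpha + \beta)^2 &= 5 + 2\sqrt{6}, \qquad \bigl((\alpha + \beta)^2 - 5\bigr)^2 = 24,\\
(\alpha + \beta)^4 &- 10(\alpha + \beta)^2 + 1 = 0, \qquad (\alpha\beta)^2 - 6 = 0 .
\end{align*}
```

So $\alpha + \beta$ is a zero of $x^4 - 10x^2 + 1$ and $\alpha\beta$ is a zero of $x^2 - 6$. Even here
the polynomial for the sum has to be found by a trick, whereas the module-theoretic
proof never needs to write it down.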


8.3 The Hilbert Basis Theorem 


This section gives us a second important application of Theorem 8.2.12. 

In this section we consider polynomial rings of the form $R[x_1, x_2, \ldots, x_n]$, where
$x_1, x_2, \ldots, x_n$ are commuting indeterminates. In the language of Sect. 7.3, these
are the monoid rings $RM$ where $M$ is the free commutative monoid on the set
$\{x_1, x_2, \ldots, x_n\}$. However, and in contrast to most textbooks, we shall not require that
the coefficient ring $R$ be commutative. We recall that by Exercise (3) in Sect. 7.5.3,
the iterated polynomial ring $R[x][y]$ can be identified with $R[x, y]$. Since $R$ is not
assumed to be commutative, we see a fortiori that the polynomial ring $R[x]$ may not
be commutative. However, we hasten to remind the reader that the indeterminate $x$
commutes with every element of $R$. As a result of this, we infer immediately that
$R[x]$ inherits a natural right $R$-module structure.

Theorem 8.3.1 (The Hilbert Basis Theorem) If $R$ is right Noetherian, then $R[x]$ is
also right Noetherian.


Proof It suffices to show that if $A$ is a right ideal of $R[x]$, then $A$ is finitely generated
as a right $R[x]$-module. Let $A^+$ be the collection of all leading coefficients of all
polynomials lying in $A$; that is


\[ A^+ := \Bigl\{ c_m \Bigm| \text{there exists } f(x) = \sum_{j=0}^{m} c_j x^j \in A,\ \deg f(x) = m \Bigr\} \subseteq R. \]

Since $A$ is a right ideal in the polynomial ring $R[x]$, $A^+$ is a right ideal in $R$. Since $R$
is right Noetherian, we know that $A^+$ must be finitely generated as a right $R$-module,
so we may write

\[ A^+ = a_1R + a_2R + \cdots + a_nR \]

for suitable elements $a_1, a_2, \ldots, a_n \in A^+$. From the definition of $A^+$, there exist
polynomials $p_i(x)$ in $A$ having lead coefficients $a_i$, $i = 1, 2, \ldots, n$. Let $n_i :=
\deg p_i(x)$ and set $m := \max\{n_i\}$. Now let $R_m[x]$ be the polynomials of degree at
most $m$ in $R[x]$ and set

\[ A_m = A \cap R_m[x]. \]

We claim that

\[ A = A_m + p_1(x)R[x] + p_2(x)R[x] + \cdots + p_n(x)R[x]. \tag{8.19} \]

Let $f(x)$ be a polynomial in $A$ of degree $N > m$ and write

\[ f(x) = ax^N + \text{terms involving lower powers of } x. \]

Then, as $a$ is the leading coefficient of $f(x)$ we have $a \in A^+$, and so

\[ a = a_1r_1 + a_2r_2 + \cdots + a_nr_n \]

for suitable $r_1, r_2, \ldots, r_n \in R$. Then

\[ f_1(x) = f(x) - x^{N-n_1}p_1(x)r_1 - x^{N-n_2}p_2(x)r_2 - \cdots - x^{N-n_n}p_n(x)r_n \]

is a polynomial in $A$ which has degree less than $N$. In turn, if the degree of $f_1(x)$
exceeds $m$ this procedure can be repeated. By so doing, we eventually obtain a
polynomial $h(x)$ in $A$ of degree at most $m$, which is equal to the original $f(x)$ minus
an $R[x]$-linear combination of $\{p_1(x), p_2(x), \ldots, p_n(x)\}$. But since each $p_i(x)$ and
$f(x)$ all lie in $A$, we see that $h(x) \in A_m$. Thus


\[ f(x) \in A_m + p_1(x)R[x] + \cdots + p_n(x)R[x]. \]

Since $f(x)$ was an arbitrary member of $A$, Eq. (8.19) holds.

In view of Eq. (8.19), it remains only to show that $A_mR[x]$ is a finitely generated
$R[x]$-module, and this is true if $A_m$ is a finitely generated $R$-module. But $A_m$ is
an $R$-submodule of $R_m[x]$, which is certainly a finitely generated $R$-module (for
$\{1, x, \ldots, x^m\}$ is a set of generators). By Noether’s Theorem (Theorem 8.2.12) $R_m[x]$
is Noetherian and hence so is $A_m$. Thus, $A_m$ is a finitely-generated right $R$-module,
and the proof is complete.
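
The degree-reduction step in this proof can be followed on a small example. The ring
$R = \mathbb{Z}$ and the ideal $A$ below are our own choices for illustration, with the
notation $A^+$, $p_i$, $n_i$, $m$, $A_m$ and $f_1$ as in the proof; the fragment assumes the
amsmath and amssymb packages.

```latex
% Assumed example: R = Z and A = 2Z[x] + xZ[x], the polynomials with even
% constant term.
\begin{align*}
A^{+} &= \mathbb{Z} = 1 \cdot \mathbb{Z}, \qquad a_1 = 1, \quad p_1(x) = x, \quad n_1 = m = 1,\\
A_1 &= A \cap R_1[x] = \{ c_0 + c_1 x : c_0 \in 2\mathbb{Z},\ c_1 \in \mathbb{Z} \} = 2\mathbb{Z} + x\mathbb{Z},\\
f(x) &= x^3 + 4x + 2, \qquad N = 3, \quad a = 1 = a_1 r_1 \ \text{ with } r_1 = 1,\\
f_1(x) &= f(x) - x^{N - n_1} p_1(x) r_1 = f(x) - x^2 \cdot x = 4x + 2 \in A_1,\\
f(x) &= (4x + 2) + p_1(x)\, x^2 \in A_1 + p_1(x) R[x] .
\end{align*}
```

Here a single reduction step already brings the degree down to $m = 1$. Since $A_1$ is
generated as a $\mathbb{Z}$-module by $2$ and $x$, the ideal $A$ is generated by $2$ and $x$.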


Corollary 8.3.2 If $R$ is right Noetherian, then so is $R[x_1, \ldots, x_n]$.


Remark As expected, there is an obvious left version of the above result. Thus, if $R$
is left Noetherian, then $R[x]$ is also a left Noetherian ring.


8.4 Mapping Properties of Modules 


In this section we shall consider the “calculus of homomorphisms” of modules,
that is, the idea that constructions can be characterized by mapping properties. Such
an approach leads to the important notions of exact sequences, projective and injective
modules, and other notions which feed the rich lore of homological algebra.


8.4.1 The Universal Mapping Properties of the Direct Sum 
and Direct Product 


Characterizing the Direct Sum by a Universal Mapping Property 


Recall that the formal direct sum over a family $\{M_\sigma \mid \sigma \in I\}$ of right $R$-modules is the
set of all mappings $f : I \to \bigcup_\sigma M_\sigma$, with $f(\sigma) \in M_\sigma$, and $f(\sigma)$ non-zero for only
finitely many indices $\sigma$. We think of the indices as indicators of coordinates, and the
value $f(\sigma)$ as the $\sigma$-coordinate of the element $f$. This is an $R$-module $\bigoplus_\sigma M_\sigma$ under
the usual coordinate-wise addition and application of right operators from $R$.

There is at hand a canonical collection of injective $R$-homomorphisms
$\kappa_\sigma : M_\sigma \to \bigoplus_\sigma M_\sigma$ defined by setting


\[ (\kappa_\sigma(a_\sigma))(\tau) = \begin{cases} a_\sigma & \text{if } \sigma = \tau, \\ 0 & \text{if } \sigma \ne \tau, \end{cases} \]


where $a_\sigma$ is an arbitrary element of $M_\sigma$. That is, $\kappa_\sigma$ takes an element $a$ of $M_\sigma$ to
the unique element of the direct sum whose coordinate at $\sigma$ is $a$, and whose other
coordinates are all zero.


Theorem 8.4.1 Suppose $M = \bigoplus_{\sigma \in I} M_\sigma$ is the direct sum of modules $\{M_\sigma \mid \sigma \in I\}$
with canonical homomorphisms $\kappa_\sigma : M_\sigma \to M$. Then, for every module $B$, and
family of homomorphisms $\phi_\sigma : M_\sigma \to B$, there exists a unique homomorphism
$\phi : M \to B$ such that $\phi \circ \kappa_\sigma = \phi_\sigma$ for all $\sigma \in I$. In fact, this property characterizes
the direct sum up to isomorphism.


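Proof We define $\phi : M \to B$ by the equation


\[ \phi(a) := \sum_{\sigma \in I} \phi_\sigma(a(\sigma)). \]


Since $a(\sigma) \in M_\sigma$, by definition, and only finitely many of these summands are
non-zero, the sum describes a unique element of $B$. It is elementary to verify that $\phi$, as
we have defined it, is an $R$-homomorphism. Moreover, if $a_\tau$ is an arbitrary element
of $M_\tau$, then


\[ (\phi \circ \kappa_\tau)(a_\tau) = \phi(\kappa_\tau(a_\tau)) = \sum_{\sigma \in I} \phi_\sigma\bigl((\kappa_\tau(a_\tau))(\sigma)\bigr) = \phi_\tau(a_\tau). \]


Thus we have $\phi \circ \kappa_\tau = \phi_\tau$, for all $\tau$.
Now suppose there were a second homomorphism $\lambda : M \to B$, with the property
that $\lambda \circ \kappa_\tau = \phi_\tau$, for all $\tau \in I$. Then


\[ \lambda(a) = \sum_{\sigma \in I} \lambda(\kappa_\sigma(a(\sigma))) = \sum_{\sigma \in I} \phi_\sigma(a(\sigma)) = \phi(a). \]


Thus $\lambda = \phi$. So the homomorphism $\phi$ which we have produced is unique.

Now assume that the $R$-module $N$, and the collection of morphisms $\lambda_\sigma : M_\sigma \to N$
satisfy the universal mapping conditions of the last part of the theorem.

First apply this property to $M$ when $B = N$ and $\phi_\sigma = \lambda_\sigma$. Then there is a
homomorphism $\lambda : M \to N$ with $\lambda \circ \kappa_\sigma = \lambda_\sigma$.

Similarly, there is a homomorphism $\kappa : N \to M$ for which $\kappa \circ \lambda_\sigma = \kappa_\sigma$. It
follows that

\[ \kappa \circ \lambda \circ \kappa_\sigma = \kappa \circ \lambda_\sigma = \kappa_\sigma = 1_M \circ \kappa_\sigma, \]


where $1_M$ is the identity morphism on $M$. So by the uniqueness of the
homomorphism, one may conclude that $\kappa \circ \lambda = 1_M$. Similarly, reversing the roles,
$\lambda \circ \kappa = 1_N$. Therefore $\kappa$ is an isomorphism. The proof is complete.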


We have seen from our earlier definition of the direct sum $M = \bigoplus_{\sigma \in I} M_\sigma$ that
there are projection mappings $\pi_\sigma : M \to M_\sigma$ which, for any element $m \in M$, read
off its $\sigma$-coordinate, $m(\sigma)$. One can see that these projection mappings are related to
the canonical mappings $\kappa_\sigma$ by the equations


\[ \pi_i \circ \kappa_i = 1_{M_i} \quad \text{(the identity mapping on } M_i\text{)} \tag{8.20} \]
\[ \pi_i \circ \kappa_j = 0 \quad \text{if } i \ne j. \tag{8.21} \]


We remark that the existence of the projection mappings $\pi_i$ with the compositions
just listed could have been deduced directly from the universal mapping property
which characterizes the direct sum. Let $\delta_{ij}$ be the morphism $M_i \to M_j$ which is
the zero map if $i \ne j$, but is the identity mapping on $M_i$ if $i = j$. If we apply the
universal property with $M_j$ and $\delta_{ij}$ in the respective roles of $B$ and $\phi_i$, we obtain the
desired morphism $\phi = \pi_j : M \to M_j$ with $\pi_j \circ \kappa_i = \delta_{ij}$.
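
The universal property of the direct sum is easy to trace in the smallest nontrivial
case. The index set, modules and maps below are our own choices for illustration,
with $\kappa_i$, $\pi_i$ and $\phi$ as in Theorem 8.4.1; the fragment assumes the amsmath
package.

```latex
% Assumed example: I = {1, 2}, M_1 = M_2 = B = Z,
% \phi_1 = identity, \phi_2 = multiplication by 3.
\begin{align*}
\phi(a_1, a_2) &= \phi_1(a_1) + \phi_2(a_2) = a_1 + 3a_2,\\
(\phi \circ \kappa_1)(a) &= \phi(a, 0) = a = \phi_1(a),\\
(\phi \circ \kappa_2)(a) &= \phi(0, a) = 3a = \phi_2(a),\\
(\pi_1 \circ \kappa_1)(a) &= a, \qquad (\pi_1 \circ \kappa_2)(a) = 0 .
\end{align*}
```

Uniqueness is visible as well: any homomorphism $\lambda$ with $\lambda \circ \kappa_i = \phi_i$ must
satisfy $\lambda(a_1, a_2) = \lambda(\kappa_1(a_1)) + \lambda(\kappa_2(a_2)) = a_1 + 3a_2$.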


Characterizing the Direct Product by a Universal Mapping Property 


We can now characterize the direct product of modules by a universal property which 
is the complete dual of that for the direct sum. 


Theorem 8.4.2 Suppose $M = \prod_{\sigma \in I} M_\sigma$ is the direct product of the modules
$\{M_\sigma \mid \sigma \in I\}$ with canonical projections $\pi_\sigma : M \to M_\sigma$. Then, for every module
$B$, and family of homomorphisms $\phi_i : B \to M_i$ there exists a unique homomorphism
$\phi : B \to M$ such that $\pi_i \circ \phi = \phi_i$, for each $i \in I$. This property characterizes
the direct product up to isomorphism.


Proof The canonical projection mapping $\pi_i$, recall, reads off the $i$-th coordinate of
each element of $M$. Thus, if $m \in M$, $m$ is a mapping $I \to \bigcup_\sigma M_\sigma$ with $m(i) \in M_i$,
and $\pi_i(m) := m(i)$.

Now define the mapping $\phi : B \to M$ by the rule:


\[ \phi(b)(i) := \phi_i(b), \]


for all $(i, b) \in I \times B$. Then certainly $\phi$ is an $R$-homomorphism. Moreover, one
checks that


\[ (\pi_i \circ \phi)(b) = \pi_i(\phi(b)) = (\phi(b))(i) = \phi_i(b). \]


Thus $\pi_i \circ \phi = \phi_i$.
Now if $\psi : B \to M$ were an $R$-homomorphism which also satisfied $\pi_i \circ \psi = \phi_i$,
for all $i \in I$, then we should have


\[ (\psi(b))(i) = \pi_i(\psi(b)) = (\pi_i \circ \psi)(b) = \phi_i(b) = (\phi(b))(i), \]


at each $i$, so $\psi(b) = \phi(b)$ at each $b \in B$, i.e., $\psi = \phi$ as mappings.

Now suppose $N$ were some $R$-module equipped with morphisms $\pi'_i : N \to M_i$,
$i \in I$, satisfying this mapping property: whenever there is a module $B$ with morphisms
$\phi_i : B \to M_i$, then there is a unique morphism $\phi_B : B \to N$ such that $\pi'_i \circ \phi_B = \phi_i$.

First we use the property for $M$. Setting $B = N$ and $\phi_i = \pi'_i$, we obtain (in the
role of $\phi$) a morphism $\pi' : N \to M$ such that $\pi_i \circ \pi' = \pi'_i$.

Similarly, using the same property for $N$ (with $M$ and $\pi_i$ in the roles of $B$ and the
$\phi_i$), we obtain a homomorphism $\pi : M \to N$ such that $\pi'_i \circ \pi = \pi_i$. It follows that


\[ \pi_i \circ \pi' \circ \pi = \pi'_i \circ \pi = \pi_i = \pi_i \circ 1_M. \]

By the uniqueness, one has $\pi' \circ \pi = 1_M$. Similarly $\pi \circ \pi' = 1_N$. Therefore $\pi$ is an
isomorphism.

As a final remark, the universal mapping property can be used to show the existence
of mappings $\kappa_i : M_i \to \prod_{\sigma \in I} M_\sigma$ such that $\pi_j \circ \kappa_i$ is the identity map on $M_i$ if $i = j$,
and is the zero mapping $M_i \to M_j$ if $i \ne j$.


8.4.2 $\mathrm{Hom}_R(M, N)$


The Basic Definition


We shall continue to assume that $R$ is an arbitrary ring with identity element $1_R$. If
$M$, $N$ are right $R$-modules, then we may form the abelian group $\mathrm{Hom}_R(M, N)$
consisting of $R$-module homomorphisms $M \to N$. Indeed, the abelian group structure
on $\mathrm{Hom}_R(M, N)$ is simply given by pointwise addition: if $\phi, \theta \in \mathrm{Hom}_R(M, N)$,
then $(\phi + \theta)(m) := \phi(m) + \theta(m) \in N$.


$\mathrm{Hom}_R(M, N)$ Is a Left $R$-Module


We remark here that $\mathrm{Hom}_R(M, N)$ possesses a natural structure of a left $R$-module:
For each ring element $r \in R$ and homomorphism $\phi \in \mathrm{Hom}_R(M, N)$, the mapping
$r \cdot \phi$ is defined by

\[ (r \cdot \phi)(m) := \phi(mr) = \phi(m)r. \]


Notice, by this definition, that for $r, s \in R$, and $m \in M$,


\begin{align*}
(r \cdot (s \cdot \phi))(m) &= (s \cdot \phi)(mr) = (\phi(mr))s \tag{8.22}\\
&= (\phi(m)r)s = \phi(m)(rs) = ((rs) \cdot \phi)(m). \tag{8.23}
\end{align*}


In the preceding equations left multiplication of maps by ring elements was indi- 
cated by a “dot”. The sole purpose of this device was to clearly distinguish the 
newly-defined left ring multiplications of maps from right ring multiplication of 
the module elements in the equations. However, from this point onward, the “dot” 
operation will be denoted by juxtaposition as we have been doing for left modules 
generally. 

Clearly $r\phi \in \mathrm{Hom}_R(M, N)$, and the reader can easily check that the required
equations of mappings hold:


\[ r(\phi_1 + \phi_2) = r\phi_1 + r\phi_2, \]
\[ (r + s)\phi_1 = r\phi_1 + s\phi_1, \quad \text{and} \]
\[ (rs)\phi_1 = r(s\phi_1), \]

where $r, s \in R$ and $\phi_i \in \mathrm{Hom}_R(M, N)$, $i = 1, 2$. In particular, when $R$ is a
commutative ring, one may regard $\mathrm{Hom}_R(M, N)$ as a right $R$-module simply by
passing scalars from one side to another. (For more about making $\mathrm{Hom}_R(M, N)$ into
a module, see the Exercise Sect. 8.5.3.)

In the special case that $N = R_R$, and $M$ is any right $R$-module, we obtain a left
$R$-module $M^* := \mathrm{Hom}_R(M, R_R)$ called the dual module of $M$. We shall have use
for this latter, especially when $R$ is a field.


Basic Mapping Properties of $\mathrm{Hom}_R(M, N)$


We note that $\mathrm{Hom}_R$ has the following mapping properties. If $\alpha : M_1 \to M_2$ and
$\beta : N_1 \to N_2$ are homomorphisms of right $R$-modules, then we have an induced
homomorphism of abelian groups,


\[ \mathrm{Hom}(\alpha, \beta) : \mathrm{Hom}_R(M_2, N_1) \to \mathrm{Hom}_R(M_1, N_2), \qquad \phi \mapsto \beta \circ \phi \circ \alpha. \tag{8.24} \]


The reader should have no difficulty in verifying that $\mathrm{Hom}(\alpha, \beta)$ really is a
homomorphism of abelian groups. Let $\mathrm{id}_M$ denote the identity mapping $M \to M$.²
In particular, if we fix a right $R$-module $M$, then the induced homomorphism of
abelian groups is:


\[ \mathrm{Hom}(\mathrm{id}_M, \beta) : \mathrm{Hom}_R(M, N_1) \to \mathrm{Hom}_R(M, N_2), \qquad \phi \mapsto \beta \circ \phi. \tag{8.25} \]


In a language that will be developed later in this book, this says that the assignment
$\mathrm{Hom}_R(M, \bullet)$ is a “functor” from the “category” of right $R$-modules to the “category”
of abelian groups. (See Sect. 13.2 for a full exposition.) Likewise, if $N$ is a fixed right
$R$-module, and if $\alpha : M_2 \to M_1$ is an $R$-module homomorphism, then we have an
induced homomorphism of abelian groups


\[ \mathrm{Hom}(\alpha, \mathrm{id}_N) : \mathrm{Hom}_R(M_1, N) \to \mathrm{Hom}_R(M_2, N), \qquad \phi \mapsto \phi \circ \alpha. \]

This gives rise to another “functor,” $\mathrm{Hom}_R(\bullet, N)$, from the category of right
$R$-modules to the category of abelian groups, but is “contravariant” in the sense
that the directions of the homomorphisms get reversed:


\[ M_1 \xleftarrow{\ \alpha\ } M_2 \quad \text{gets mapped to} \quad \mathrm{Hom}_R(M_1, N) \xrightarrow{\ \mathrm{Hom}(\alpha,\, \mathrm{id}_N)\ } \mathrm{Hom}_R(M_2, N). \]


² The notation “id” serves to distinguish the identity mapping on the set $M$ from the multiplicative
identity element $1_M$ of any ring or monoid which unfortunately happens to be named “$M$”.

8.4.3 Exact Sequences


Definition 


Let $\phi : M \to N$ be a homomorphism of right $R$-modules, and set $M'' :=
\phi(M)$, $M' := \ker\phi$. Then we may build a sequence of homomorphisms


\[ M' \xrightarrow{\ \mu\ } M \xrightarrow{\ \epsilon\ } M'', \]


where $\mu : M' \hookrightarrow M$ is just the inclusion homomorphism and where $\epsilon(m) = \phi(m) \in
M''$ for all $m \in M$. Furthermore, by definition, we notice the seemingly innocuous
fact that $\ker\epsilon = \mathrm{im}\,\mu$.

More generally, suppose that we have a sequence of right $R$-modules and
$R$-module homomorphisms:


\[ \cdots \to M_{i-1} \xrightarrow{\ \phi_{i-1}\ } M_i \xrightarrow{\ \phi_i\ } M_{i+1} \to \cdots \]


We say that the above sequence is exact at $M_i$ if $\mathrm{im}\,\phi_{i-1} = \ker\phi_i$. The sequence
above is said to be exact if it is exact at every $M_i$. Note that the homomorphism
$\mu : M \to N$ is injective if and only if the sequence $0 \to M \xrightarrow{\mu} N$ is exact. Likewise
$\epsilon : M \to N$ is surjective if and only if the sequence $M \xrightarrow{\epsilon} N \to 0$ is exact.
Of particular interest are the short exact sequences. These are exact sequences of
the form


\[ 0 \to M' \xrightarrow{\ \mu\ } M \xrightarrow{\ \epsilon\ } M'' \to 0. \]


Note that in this case Theorem 8.1.2 simply says that $M'' \cong M/\mu(M')$.


The Left-Exactness of Hom 


The “hom functors” come very close to preserving short exact sequences, as follows. 


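Theorem 8.4.3 Let $0 \to N' \xrightarrow{\mu} N \xrightarrow{\epsilon} N''$ be an exact sequence of right
$R$-modules, and let $M$ be a fixed $R$-module. Then the following sequence of abelian
groups is also exact:


\[ 0 \to \mathrm{Hom}_R(M, N') \xrightarrow{\ \mathrm{Hom}(1_M,\, \mu)\ } \mathrm{Hom}_R(M, N) \xrightarrow{\ \mathrm{Hom}(1_M,\, \epsilon)\ } \mathrm{Hom}_R(M, N''). \]


Likewise, if $M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$ is an exact sequence of right $R$-modules, and
if $N$ is a fixed $R$-module, then the following sequence of abelian groups is exact:


\[ 0 \to \mathrm{Hom}_R(M'', N) \xrightarrow{\ \mathrm{Hom}(\epsilon,\, 1_N)\ } \mathrm{Hom}_R(M, N) \xrightarrow{\ \mathrm{Hom}(\mu,\, 1_N)\ } \mathrm{Hom}_R(M', N). \]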


Proof Assume that $0 \to N' \xrightarrow{\mu} N \xrightarrow{\epsilon} N''$ is exact and that $M$ is a fixed
$R$-module. First of all, note that if $\phi' \in \mathrm{Hom}_R(M, N')$, then since $\mu$ is
injective, we see that $0 = \mathrm{Hom}(1_M, \mu)(\phi') = \mu \circ \phi'$ clearly implies that $\phi' = 0$,
and so $\mathrm{Hom}(1_M, \mu) : \mathrm{Hom}_R(M, N') \to \mathrm{Hom}_R(M, N)$ is injective. Next, since
$\mathrm{Hom}(1_M, \epsilon) \circ \mathrm{Hom}(1_M, \mu) = \mathrm{Hom}(1_M, \epsilon \circ \mu) = \mathrm{Hom}(1_M, 0) = 0$, we have
$\mathrm{im}\,\mathrm{Hom}(1_M, \mu) \subseteq \ker\mathrm{Hom}(1_M, \epsilon)$. Finally, suppose that $\phi \in \ker\mathrm{Hom}(1_M, \epsilon)$. Then
$\epsilon \circ \phi = 0 : M \to N''$, which says that for each $m \in M$, $\phi(m) \in \ker\epsilon = \mathrm{im}\,\mu$.
But since $\mu : N' \to N$ is injective, we may write $\phi(m) = \mu(\phi'(m))$ for some
uniquely defined element $\phi'(m) \in N'$. Finally, since $\phi' : M \to N'$ is clearly a
homomorphism, we are finished as we have proved that $\mathrm{Hom}(1_M, \mu)(\phi') = \phi$. The
second statement in the above theorem is proved similarly.


If the sequences in Theorem 8.4.3 were completed to short exact sequences, it 
would be natural to wonder whether the resulting sequences would be exact. This is 
not the case as the following simple counterexample shows. Consider the short exact 
sequence of Z-modules (i.e., abelian groups): 


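$$0 \longrightarrow \mathbb{Z} \xrightarrow{\ \mu_2\ } \mathbb{Z} \xrightarrow{\ \epsilon\ } \mathbb{Z}/\mathbb{Z}2 \longrightarrow 0,$$


where $\mu_2 : \mathbb{Z} \to \mathbb{Z}$ is multiplication by 2: $\mu_2(m) = 2m$, and where $\epsilon$ is the canonical
quotient mapping. The sequence below

$$0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}) \xrightarrow{\ \operatorname{Hom}(1_{\mathbb{Z}/\mathbb{Z}2},\,\mu_2)\ } \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}) \xrightarrow{\ \operatorname{Hom}(1_{\mathbb{Z}/\mathbb{Z}2},\,\epsilon)\ } \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}/\mathbb{Z}2) \longrightarrow 0$$

is not exact as $\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}) = 0$, whereas $\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}/\mathbb{Z}2) \cong \mathbb{Z}/\mathbb{Z}2$, and
so $\operatorname{Hom}(\mathbb{Z}/\mathbb{Z}2, \epsilon) : \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}) \to \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/\mathbb{Z}2, \mathbb{Z}/\mathbb{Z}2)$ certainly cannot be
surjective.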

Thus, we see that the “hom functors” do not quite preserve exactness. The extent
to which short exact sequences fail to map to short exact sequences, and the remedy
for this deficiency, is the subject of homological algebra. (In a subsequent chapter,
we shall encounter another “functor”, namely that derived from the tensor product,
from right R-modules to abelian groups which also fails to preserve short exact
sequences.)
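
The failure of surjectivity in the counterexample can be checked by a small computation. The sketch below rests on the standard fact that a homomorphism out of the cyclic group Z/Z2 is determined by the image x of its generator, and that x is admissible exactly when 2x = 0 in the target. The helper name `admissible_images` and the finite search window for Z are illustrative assumptions, not part of the text.

```python
# A homomorphism Z/n -> A is fixed by the image x of the generator 1 + nZ,
# and x is admissible exactly when n*x = 0 in A.

def admissible_images(n, candidates, scale):
    """Return the candidates x with n*x == 0, where scale(n, x) computes n*x in A."""
    return [x for x in candidates if scale(n, x) == 0]

# Target A = Z. Only a finite window is searched (illustrative assumption);
# in Z, n*x = 0 with n != 0 forces x = 0 anyway.
hom_to_Z = admissible_images(2, range(-50, 51), lambda n, x: n * x)

# Target A = Z/2, with elements represented by 0 and 1.
hom_to_Z2 = admissible_images(2, range(2), lambda n, x: (n * x) % 2)

# Hom(Z/2, eps) sends the map with generator image x to the map with image x mod 2.
image_of_induced_map = sorted({x % 2 for x in hom_to_Z})

print(hom_to_Z)              # generator images defining maps Z/2 -> Z
print(hom_to_Z2)             # generator images defining maps Z/2 -> Z/2
print(image_of_induced_map)  # the part of Hom(Z/2, Z/2) reached by Hom(Z/2, eps)
```

The identity map of Z/Z2 corresponds to the generator image 1, which never appears in the image of the induced map; this is exactly the missing surjectivity.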


Split Exact Sequences 


Note that given any right R-modules M’, M”, we can always build the following 
rather trivial short exact sequence: 


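$$0 \longrightarrow M' \xrightarrow{\ \mu_1\ } M' \oplus M'' \xrightarrow{\ \pi_2\ } M'' \longrightarrow 0,$$


where $\mu_1$ is inclusion into the first summand and $\pi_2$ is projection onto the second
summand. Such a short exact sequence is called a split short exact sequence. More
generally, we say that the short exact sequence $0 \to M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$ splits
if and only if there is an isomorphism $\alpha : M \to M' \oplus M''$ making the following
ladder diagram commute³:

$$\begin{array}{ccccccccc}
0 & \longrightarrow & M' & \xrightarrow{\ \mu\ } & M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle 1_{M'}} & & \big\downarrow{\scriptstyle \alpha} & & \big\downarrow{\scriptstyle 1_{M''}} & & \\
0 & \longrightarrow & M' & \xrightarrow{\ \mu_1\ } & M' \oplus M'' & \xrightarrow{\ \pi_2\ } & M'' & \longrightarrow & 0
\end{array}$$
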
In turn, the following lemma provides simple conditions for a short exact sequence 
to split: 


Lemma 8.4.4 Let

$$0 \longrightarrow M' \xrightarrow{\ \mu\ } M \xrightarrow{\ \epsilon\ } M'' \longrightarrow 0 \qquad (*)$$

be a short exact sequence of right R-modules. Then the following conditions are
equivalent:


(i) There exists an R-module homomorphism $\sigma : M'' \to M$ such that $\epsilon \circ \sigma = 1_{M''}$.
(ii) There exists an R-module homomorphism $\rho : M \to M'$ such that $\rho \circ \mu = 1_{M'}$.
(iii) The short exact sequence (*) splits.


Proof Clearly condition (iii) implies both (i) and (ii); we shall be content to prove
that (i) implies (iii), leaving the remaining implication to the reader. Note first that
since $\epsilon \circ \sigma = 1_{M''}$, it follows immediately that $\sigma : M'' \to M$ is injective. In
particular, $\sigma : M'' \to \sigma(M'')$ is an isomorphism. Clearly it suffices to show that
$M = \mu(M') \oplus \sigma(M'')$ (internal direct sum). If $m \in M$ then
$\epsilon(m - \sigma(\epsilon(m))) = \epsilon(m) - (\epsilon \circ \sigma)(\epsilon(m)) = \epsilon(m) - \epsilon(m) = 0$,
which says that $m - \sigma(\epsilon(m)) \in \ker \epsilon = \operatorname{im} \mu$.
Therefore, there exists $m' \in M'$ with $m - \sigma(\epsilon(m)) = \mu(m')$, forcing
$m = \mu(m') + \sigma(\epsilon(m)) \in \mu(M') + \sigma(M'')$. Finally, if $\mu(m') = \sigma(m'')$ for some
$m' \in M'$, $m'' \in M''$, then $m'' = \epsilon(\sigma(m'')) = \epsilon(\mu(m')) = 0$, which implies that
$\mu(M') \cap \sigma(M'') = 0$, proving that $M = \mu(M') \oplus \sigma(M'')$, as required.
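
The proof that (i) implies (iii) can be traced on a small example. The sketch below uses the short exact sequence of abelian groups 0 → Z/2 → Z/6 → Z/3 → 0; the particular maps `mu`, `eps` and `sigma` are choices made for this illustration and are not taken from the text.

```python
# Short exact sequence 0 -> Z/2 --mu--> Z/6 --eps--> Z/3 -> 0 (illustrative choice).
M1 = range(2)   # M'  = Z/2
M = range(6)    # M   = Z/6
M2 = range(3)   # M'' = Z/3

def mu(x):      # injective: Z/2 -> Z/6, x -> 3x
    return (3 * x) % 6

def eps(m):     # surjective: Z/6 -> Z/3, reduction mod 3
    return m % 3

def sigma(y):   # candidate section: Z/3 -> Z/6, y -> 4y (well defined since 4*3 = 12 = 0 mod 6)
    return (4 * y) % 6

# Condition (i): eps o sigma is the identity on M''.
is_section = all(eps(sigma(y)) == y for y in M2)

image_mu = {mu(x) for x in M1}
image_sigma = {sigma(y) for y in M2}

# The step of the proof: m - sigma(eps(m)) lies in ker(eps) = im(mu).
remainders = {m: (m - sigma(eps(m))) % 6 for m in M}

print(is_section, sorted(image_mu), sorted(image_sigma), remainders)
```

Every element m of Z/6 is the sum of `remainders[m]` in the image of `mu` and `sigma(eps(m))` in the image of `sigma`, and the two images meet only in 0, which is the internal direct sum decomposition asserted in the lemma.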


8.4.4 Projective and Injective Modules 


Let R be a ring and let P be an R-module. We say that P is projective if every 
diagram of the form 


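$$\begin{array}{ccccc}
 & & P & & \\
 & & \big\downarrow{\scriptstyle \varphi''} & & \\
M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0
\end{array} \qquad \text{(exact row)}$$

can be embedded in a commutative diagram of the form

$$\begin{array}{ccccc}
 & & P & & \\
 & {\scriptstyle \varphi}\swarrow & \big\downarrow{\scriptstyle \varphi''} & & \\
M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0
\end{array} \qquad \text{(exact row)}$$

³ Recall the definition of “commutative diagram” on p. 237.
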


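Note first that any free right R-module is projective. Indeed, assume that F is a
free right R-module with basis $\{f_\alpha \mid \alpha \in \mathcal{A}\}$ and that we have a diagram of the form

$$\begin{array}{ccccc}
 & & F & & \\
 & & \big\downarrow{\scriptstyle \varphi''} & & \\
M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0
\end{array} \qquad \text{(exact row)}$$

For each $m''_\alpha := \varphi''(f_\alpha) \in M''$, $\alpha \in \mathcal{A}$, select an element $m_\alpha \in M$ with
$\epsilon(m_\alpha) = m''_\alpha$. As F is free with basis $\{f_\alpha \mid \alpha \in \mathcal{A}\}$, there exists a (unique)
homomorphism $\varphi : F \to M$ with $\varphi(f_\alpha) = m_\alpha$, $\alpha \in \mathcal{A}$. This gives the desired
“lifting” of the homomorphism $\varphi''$:

$$\begin{array}{ccccc}
 & & F & & \\
 & {\scriptstyle \varphi}\swarrow & \big\downarrow{\scriptstyle \varphi''} & & \\
M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0
\end{array} \qquad \text{(exact row)}$$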


We have the following simple characterization of projective modules. 


Theorem 8.4.5 The following conditions are equivalent for the R-module P. 


(i) P is projective. 
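(ii) Every short exact sequence $0 \to M' \to M \to P \to 0$ splits.
(iii) P is a direct summand of a free R-module.


Proof That (i) implies (ii) follows immediately from Lemma 8.4.4: applying the
projectivity of P to the identity map $1_P$ and the surjection $\epsilon : M \to P$ yields a
homomorphism $\sigma : P \to M$ with $\epsilon \circ \sigma = 1_P$. Now assume
condition (ii) and let $\epsilon : F \to P$ be a surjective R-module homomorphism, where
F is a free right R-module. Let $K = \ker \epsilon$ and consider the short exact sequence

$$0 \longrightarrow K \xrightarrow{\ \mu\ } F \xrightarrow{\ \epsilon\ } P \longrightarrow 0,$$

where $\mu : K \hookrightarrow F$ is the inclusion. By condition (ii) this sequence splits, so
Lemma 8.4.4 provides a homomorphism $\sigma : P \to F$ with $\epsilon \circ \sigma = 1_P$ and
$F = \mu(K) \oplus \sigma(P)$. Since $\sigma(P) \cong P$, condition (iii) holds.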

Finally, we prove that (iii) implies (i). Thus, let F be a free right R-module which 
admits a direct sum decomposition F = N @® P, where N C F is a submodule. 


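From the diagram

$$\begin{array}{ccccc}
 & & P & & \\
 & & \big\downarrow{\scriptstyle \varphi} & & \\
M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0
\end{array}$$

we obtain the following diagram.

$$\begin{array}{ccccc}
F = N \oplus P & \underset{\mu}{\overset{\pi}{\rightleftarrows}} & P & & \\
\big\downarrow{\scriptstyle \theta} & & \big\downarrow{\scriptstyle \varphi} & & \\
M & \xrightarrow{\ \epsilon\ } & M'' & \longrightarrow & 0
\end{array}$$

Here $\pi : N \oplus P \to P$ is the projection of the sum onto the right coordinate, and
$\mu : P \to N \oplus P$ is the injection onto $0 \oplus P$ so that $\pi \circ \mu = 1_P$. The homomorphism
$\theta : F \to M$ satisfying $\epsilon \circ \theta = \varphi \circ \pi$ exists since, as already observed above, free
modules are projective. Setting $\bar{\varphi} := \theta \circ \mu$ then gives the desired lifting of $\varphi$,
because $\epsilon \circ \bar{\varphi} = \epsilon \circ \theta \circ \mu = \varphi \circ \pi \circ \mu = \varphi$,
and so it follows that P is projective.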
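
Condition (iii) of Theorem 8.4.5 also produces projective modules that are not free. A minimal sketch, assuming the ring R = Z/6 and the idempotent e = 3 (both chosen for this illustration, not taken from the text): the ideal eR is a direct summand of the free module R of rank one, hence projective, yet it has only two elements and so cannot be free over a ring with six elements.

```python
# R = Z/6 viewed as a free module of rank one over itself (illustrative choice).
n = 6
R = range(n)
e = 3                                   # candidate idempotent

is_idempotent = (e * e) % n == e        # e^2 = e in Z/6

P = sorted({(e * r) % n for r in R})          # the summand eR
N = sorted({((1 - e) * r) % n for r in R})    # the complementary summand (1 - e)R

# Internal direct sum: P + N is all of R and P meets N only in 0.
total = sorted({(p + q) % n for p in P for q in N})
overlap = sorted(set(P) & set(N))

print(is_idempotent, P, N, total, overlap)
```

A nonzero free Z/6-module has a number of elements that is a power of 6 (or is infinite), so the two-element module P is projective without being free.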


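Dual to the notion of a projective module is that of an injective module. A right
R-module I is said to be injective if every diagram of the form

$$\begin{array}{ccccc}
 & & I & & \\
 & & \big\uparrow{\scriptstyle \theta'} & & \\
0 & \longrightarrow & M' & \xrightarrow{\ \mu\ } & M
\end{array} \qquad \text{(exact row)}$$

can be embedded in a commutative diagram of the form

$$\begin{array}{ccccc}
 & & I & & \\
 & & \big\uparrow{\scriptstyle \theta'} & \nwarrow{\scriptstyle \theta} & \\
0 & \longrightarrow & M' & \xrightarrow{\ \mu\ } & M
\end{array}$$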


In order to obtain a characterization of injective modules we need a concept dual
to that of a free module. In preparation for this, however, we first need the concept of
a divisible abelian group. Such a group D satisfies the property that for every $d \in D$
and for every $0 \neq n \in \mathbb{Z}$, there is some $c \in D$ such that $nc = d$.


Example 44 1. The most obvious example of a divisible group is probably the 
additive group (Q, +) of rational numbers. 

2. A moment’s thought should reveal that if F is any field of characteristic 0, then 
(F, +) is a divisible group. 

3. Note that any homomorphic image of a divisible group is divisible. Of paramount 
importance is the divisible group Q/Z. 

4. Let p be a prime, and let $\mathbb{Z}(p^\infty)$ be the subgroup of $\mathbb{Q}/\mathbb{Z}$ consisting of elements
of p-power order. Then $\mathbb{Z}(p^\infty)$ is a divisible group. (You should check this. Note
that there is a direct sum decomposition $\mathbb{Q}/\mathbb{Z} = \bigoplus \mathbb{Z}(p^\infty)$, where the direct
sum is over all primes p.)

5. Note that the direct product of any number of divisible groups is also divisible. 
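
The divisibility of Q/Z can be made concrete with exact rational arithmetic. The sketch below represents a coset q + Z by its representative in [0, 1); the function names `reduce_mod_Z` and `divide_in_Q_mod_Z` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def reduce_mod_Z(q):
    """Return the representative in [0, 1) of the coset q + Z."""
    q = Fraction(q)
    return q - (q.numerator // q.denominator)   # subtract floor(q)

def divide_in_Q_mod_Z(d, n):
    """Return some c in Q/Z with n*c = d, for a nonzero integer n."""
    if n == 0:
        raise ValueError("n must be nonzero")
    return reduce_mod_Z(Fraction(d) / n)

d = Fraction(3, 4)                 # the element 3/4 + Z of Q/Z
for n in (6, -5, 7):
    c = divide_in_Q_mod_Z(d, n)
    print(n, c, reduce_mod_Z(n * c) == d)
```

The solution c is not unique (any element of order dividing n may be added to it), which is consistent with the definition: divisibility asks only for the existence of some c with nc = d.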


The importance of divisible groups is displayed by the following. 


Theorem 8.4.6 Let D be an abelian group. Then D is divisible if and only if D is 
injective. 


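Proof Assume first that D is injective. Let $d \in D$, and let $0 \neq n \in \mathbb{Z}$. We form the
diagram

$$\begin{array}{ccccc}
 & & D & & \\
 & & \big\uparrow{\scriptstyle \theta'} & \nwarrow{\scriptstyle \theta} & \\
0 & \longrightarrow & \mathbb{Z} & \xrightarrow{\ \mu_n\ } & \mathbb{Z}
\end{array}$$

where $\mu_n : \mathbb{Z} \to \mathbb{Z}$ is multiplication by n, $\theta'(m) = md$, $m \in \mathbb{Z}$, and where $\theta$ is the
extension of $\theta'$ guaranteed by the injectivity of D. If $\theta(1) = c$ then
$nc = n\theta(1) = \theta(n) = \theta(\mu_n(1)) = \theta'(1) = d$, proving that D is divisible.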

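Conversely, assume that D is divisible and that we are given a diagram of the
form

$$\begin{array}{ccccc}
 & & D & & \\
 & & \big\uparrow{\scriptstyle \theta'} & & \\
0 & \longrightarrow & A' & \xrightarrow{\ \mu\ } & A
\end{array}$$

where $\mu : A' \to A$ is an injective homomorphism of abelian groups. We shall, for
the sake of convenience, assume that $A' \le A$ via the injective homomorphism $\mu$. Let
$\mathcal{P} = \{(A'', \theta'')\}$ where $A' \le A'' \le A$ and where $\theta'' : A'' \to D$ is a homomorphism
with $\theta''|_{A'} = \theta'$. We make $\mathcal{P}$ into a poset via the relation $(A''_1, \theta''_1) \preceq (A''_2, \theta''_2)$ if
and only if $A''_1 \le A''_2$ and $\theta''_2|_{A''_1} = \theta''_1$. Since $(A', \theta') \in \mathcal{P}$, it follows that $\mathcal{P} \neq \emptyset$. If
$\{(A''_\alpha, \theta''_\alpha)\}_{\alpha \in \mathcal{A}}$ is a chain in $\mathcal{P}$, we may set $A'' = \bigcup_{\alpha \in \mathcal{A}} A''_\alpha$, and define $\theta'' : A'' \to D$
by setting $\theta''(a'') = \theta''_\alpha(a'')$ where $a'' \in A''_\alpha$. Since $\{(A''_\alpha, \theta''_\alpha)\}_{\alpha \in \mathcal{A}}$ is a chain, it
follows that $\theta''(a'')$ doesn't depend upon the particular choice of index $\alpha \in \mathcal{A}$ with
$a'' \in A''_\alpha$. Therefore, we see that the chain $\{(A''_\alpha, \theta''_\alpha)\}_{\alpha \in \mathcal{A}}$ has $(A'', \theta'')$ as an upper
bound. This allows us to apply Zorn's lemma and infer the existence of a maximal
element $(A''_0, \theta''_0) \in \mathcal{P}$. Clearly the proof is complete once we show that $A''_0 = A$.
Assume, by way of contradiction, that there exists an element $a \in A \setminus A''_0$, and let
m be the order of the element $a + A''_0$ in the additive group $A/A''_0$. We divide the
argument into the two cases according to $m = \infty$ and $m < \infty$.

$m = \infty$. In this case it follows easily that $\mathbb{Z}\langle a \rangle$ is an infinite cyclic group and that
$\mathbb{Z}\langle a \rangle \cap A''_0 = 0$. As a result, $A'' := \mathbb{Z}\langle a \rangle + A''_0 = \mathbb{Z}\langle a \rangle \oplus A''_0$ and an extension of
$\theta''_0$ can be obtained simply by setting $\theta''(ka + a''_0) = \theta''_0(a''_0)$, $k \in \mathbb{Z}$, $a''_0 \in A''_0$. This
is a well-defined homomorphism $A'' \to D$ extending $\theta''_0$. But that contradicts the
maximality of $(A''_0, \theta''_0)$ in the poset $\mathcal{P}$.

$m < \infty$. Here, we have that $ma \in A''_0$ and so $\theta''_0(ma) \in D$. Since D is divisible,
there exists an element $c \in D$ with $mc = \theta''_0(ma)$. Now define $\theta'' : A'' := \mathbb{Z}\langle a \rangle + A''_0 \to D$
by setting $\theta''(ka + a''_0) = kc + \theta''_0(a''_0) \in D$. We need to show
that $\theta''$ is well defined. Thus, assume that k, l are integers and that $a''_0, b''_0 \in A''_0$
are such that $ka + a''_0 = la + b''_0$. Then $(k - l)a = b''_0 - a''_0 \in A''_0$ and so $k - l = rm$
for some integer r. Therefore

$$\begin{aligned}
kc - lc &= rmc \\
&= r\theta''_0(ma) \\
&= \theta''_0(rma) \\
&= \theta''_0(b''_0 - a''_0) \\
&= \theta''_0(b''_0) - \theta''_0(a''_0),
\end{aligned}$$

which says that

$$kc + \theta''_0(a''_0) = lc + \theta''_0(b''_0)$$

and so $\theta'' : A'' \to D$ is well defined. As clearly $\theta''$ is a homomorphism extending
$\theta''_0$, we have once again violated the maximality of $(A''_0, \theta''_0) \in \mathcal{P}$, and the proof
is complete.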


Let R be a ring, and let A be an abelian group. Define $M := \operatorname{Hom}_{\mathbb{Z}}(R, A)$; thus
M is certainly an abelian group under point-wise operations. Give M the structure
of a right R-module via

$$(f \cdot r)(s) = f(rs), \qquad r, s \in R,\ f \in M.$$

It is easy to check that the above recipe gives $\operatorname{Hom}_{\mathbb{Z}}(R, A)$ the structure of a right
R-module. (See Exercise (1) in Sect. 8.5.3.)
The importance of the above construction is found in the following. 


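Proposition 8.4.7 Let R be a ring and let D be a divisible abelian group. Then the
right R-module $\operatorname{Hom}_{\mathbb{Z}}(R, D)$ is an injective R-module.

Proof Suppose that we are given the usual diagram

$$\begin{array}{ccccc}
 & & \operatorname{Hom}_{\mathbb{Z}}(R, D) & & \\
 & & \big\uparrow{\scriptstyle \theta'} & & \\
0 & \longrightarrow & M' & \xrightarrow{\ \mu\ } & M
\end{array} \qquad \text{(exact row)}$$

The homomorphism $\theta' : M' \to \operatorname{Hom}_{\mathbb{Z}}(R, D)$ induces the homomorphism
$\varphi' : M' \to D$ by setting $\varphi'(m') = \theta'(m')(1)$, for all $m' \in M'$. Since D is divisible, it is
injective and so we may form the commutative diagram:

$$\begin{array}{ccccc}
 & & D & & \\
 & & \big\uparrow{\scriptstyle \varphi'} & \nwarrow{\scriptstyle \varphi} & \\
0 & \longrightarrow & M' & \xrightarrow{\ \mu\ } & M
\end{array}$$

In turn, we now define $\theta : M \to \operatorname{Hom}_{\mathbb{Z}}(R, D)$ by setting $\theta(m)(r) := \varphi(mr) \in D$.
It is routine to check that $\theta$ is a homomorphism of right R-modules and that the
following diagram commutes:

$$\begin{array}{ccccc}
 & & \operatorname{Hom}_{\mathbb{Z}}(R, D) & & \\
 & & \big\uparrow{\scriptstyle \theta'} & \nwarrow{\scriptstyle \theta} & \\
0 & \longrightarrow & M' & \xrightarrow{\ \mu\ } & M
\end{array} \qquad \text{(exact row)}$$

Thus $\operatorname{Hom}_{\mathbb{Z}}(R, D)$ is an injective R-module.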


Recall that any free right R-module is the direct sum of a number of copies of $R_R$,
and that any R-module is a homomorphic image of a free module. We now define a
cofree R-module to be the direct product (not sum!) of any number of copies of the
injective module $\operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z})$. Note that by Exercise (12) in Sect. 8.5.3, a cofree
module is injective. We now have


Proposition 8.4.8 Let M be an R-module. Then M can be embedded in a cofree 
R-module. 


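Proof Let $0 \neq m \in M$ and pick a nonzero element $a(m) \in \mathbb{Q}/\mathbb{Z}$ whose order
$o(m)$ in the additive group $\mathbb{Q}/\mathbb{Z}$ is equal to that of m in the additive group M. (For
example, one could choose $a(m) = (1/o(m)) + \mathbb{Z}$; if m has infinite order, any nonzero
element of $\mathbb{Q}/\mathbb{Z}$ will serve.) This gives an additive group
homomorphism $\alpha : \mathbb{Z}\langle m \rangle \to \mathbb{Q}/\mathbb{Z}$ given by $\alpha(km) = k\,a(m)$, for each integer k.
As $\mathbb{Q}/\mathbb{Z}$ is a divisible abelian group, there exists a homomorphism of abelian groups
$\alpha_m : M \to \mathbb{Q}/\mathbb{Z}$ extending $\alpha : \mathbb{Z}\langle m \rangle \to \mathbb{Q}/\mathbb{Z}$. In turn, this homomorphism gives
a homomorphism $\beta_m : M \to \operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z})$ defined by setting
$\beta_m(m')(r) = \alpha_m(m'r) \in \mathbb{Q}/\mathbb{Z}$. It is routine to verify that $\beta_m$ is indeed a homomorphism of right
R-modules.

Next, we form the cofree module $C := \prod_{0 \neq m \in M} \operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z})_m$, where each
$\operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z})_m = \operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z})$. Collectively, the homomorphisms $\beta_m$,
$0 \neq m \in M$, give a homomorphism $\mu : M \to C$ via

$$\mu(m')(m) = \beta_m(m') \in \operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z}).$$

It remains only to show that $\mu$ is injective. Suppose m were a non-zero element in
$\ker \mu$. Then $\mu(m)(m') = 0$ for all $0 \neq m' \in M$. But this says in particular that
$0 = \beta_m(m) \in \operatorname{Hom}_{\mathbb{Z}}(R, \mathbb{Q}/\mathbb{Z})$, forcing $\alpha_m(m) = \beta_m(m)(1) = 0$. That assertion,
however, produces a contradiction since $\alpha_m(m) = a(m)$ was always chosen to be a
non-zero element of $\mathbb{Q}/\mathbb{Z}$.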


Finally we have the analogue of Theorem 8.4.5, above; the proof is entirely dual 
to that of Theorem 8.4.5. 


Theorem 8.4.9 The following conditions are equivalent for the R-module I. 


(i) I is injective. 
(ii) Every short exact sequence $0 \to I \to M \to M'' \to 0$ splits.
(iii) I is a direct summand of a cofree R-module. 


8.5 Exercises 


8.5.1 Exercises for Sect. 8.1 


1. Show that any abelian group is a Z-module. [Hint: Write the group operation as 
addition, and define the multiplication by integers. ] 

2. (Right modules from left modules). In the remark on p. 233, a recipe is given for 
converting a left R module into a right S-module, by the formula ms := 7(s)m, 
where m € M, a left R-module, s € S and where 7 : S — Opp(R) is a ring 
homomorphism mapping the multiplicative identity element of S to that of R. 
Show that all the axioms of a right S-module are satisfied by this rule. 

3. (Annihilators of modules as two-sided ideals.) Suppose that M is a right 
R-module. 


(a) Show that $\operatorname{Ann}(M) := \{r \in R \mid Mr = 0_M\}$ is a two-sided ideal in R. (It is
called the annihilator of M.)
(b) Suppose $A \le B \le M$ is a chain of submodules of the right R-module M.
Show that
$$(A : B) := \{r \in R \mid Br \subseteq A\}$$
is a 2-sided ideal of R. [Hint: (A : B) is the annihilator of the submodule
B/A. Use the previous step. Note that if A and B are right ideals of R (that
is, submodules of $R_R$), then Ann(B/A) is just the residual quotient (A : B)
defined on p. 196.]

4. Prove Corollary 8.1.7. [Hint: Establish the criterion of Lemma 8.1.6.]

5. (Finite generation)

(a) Let $(\mathbb{Q}, +)$ be the additive group of the field of rational numbers. Being an
abelian group, this group is a $\mathbb{Z}$-module by Exercise (1) in Sect. 8.5.1. Show
that $(\mathbb{Q}, +)$ is not finitely generated as a $\mathbb{Z}$-module. [Hint: Pay attention to
denominators.]

(b) Suppose M is a finitely generated right R-module. Show that for any
generating set X for M, there is a finite subset $X_0$ of X which also generates
M. [Hint: By assumption there exists a finite generating set F. Write each
member of F as a finite R-linear combination of elements of X.]

6. Let $M_\alpha$, $M'_\alpha$, $\alpha \in \mathcal{A}$ be right R-modules, and let $\varphi_\alpha : M_\alpha \to M'_\alpha$, $\alpha \in \mathcal{A}$ be
R-module homomorphisms. Prove that there is a unique homomorphism
$\prod \varphi_\alpha : \prod M_\alpha \to \prod M'_\alpha$ making each diagram below commute (the vertical maps
being the canonical projections):

$$\begin{array}{ccc}
\prod_{\alpha \in \mathcal{A}} M_\alpha & \xrightarrow{\ \prod \varphi_\alpha\ } & \prod_{\alpha \in \mathcal{A}} M'_\alpha \\
\big\downarrow & & \big\downarrow \\
M_\alpha & \xrightarrow{\ \varphi_\alpha\ } & M'_\alpha
\end{array}$$

7. Let $M_\alpha$, $M'_\alpha$, $\alpha \in \mathcal{A}$ be right R-modules, and let $\varphi_\alpha : M_\alpha \to M'_\alpha$, $\alpha \in \mathcal{A}$ be
R-module homomorphisms. Prove that there is a unique homomorphism
$\oplus \varphi_\alpha : \bigoplus M_\alpha \to \bigoplus M'_\alpha$ making each diagram below commute:

$$\begin{array}{ccc}
M_\alpha & \xrightarrow{\ \varphi_\alpha\ } & M'_\alpha \\
\big\downarrow{\scriptstyle \mu_\alpha} & & \big\downarrow{\scriptstyle \mu'_\alpha} \\
\bigoplus_{\alpha \in \mathcal{A}} M_\alpha & \xrightarrow{\ \oplus \varphi_\alpha\ } & \bigoplus_{\alpha \in \mathcal{A}} M'_\alpha
\end{array}$$

8. Prove, or provide a counterexample to the following statement. Any submodule
of a free module must be free.

9. Prove that the direct sum of free R-modules is also free.


8.5.2 Exercises for Sects. 8.2 and 8.3


1. Show that any submodule or factor module of a Noetherian (Artinian) module
is Noetherian (Artinian). [Hint: This is Sect. 2.3, Corollary 2.3.7.]

2. Suppose $M = N_1 + \cdots + N_k$, a sum of finitely many submodules $N_i$. Show that
if each submodule $N_i$ is Noetherian (Artinian) then M is Noetherian (Artinian).
[Hint: This is Sect. 2.5.4, Lemma 2.5.7, part (i).]

3. Show that if N is a submodule of M then M is Noetherian (Artinian) if and only
if both N and M/N are Noetherian (Artinian). [Hint: Isn't this just the content of
Lemma 2.5.6 of Sect. 2.5.4?]

4. Show that if $M/A_i$ is a Noetherian (Artinian) right module for a finite collection of
submodules $A_1, \ldots, A_n$, then so is $M/(A_1 \cap \cdots \cap A_n)$. [Hint: Refer to Sect. 2.5.4,
Lemma 2.5.7, part (ii).]

5. Let R be a ring. Suppose the R-module $R_R$ is Noetherian (Artinian), that is,
the ring R is right Noetherian (right Artinian). Show that any finitely generated
right R-module is Noetherian (Artinian). [Hint: Such a right module M is a
homomorphic image of a free module $F = x_1R \oplus \cdots \oplus x_nR$ over a finite number
of generators. Then apply other parts of this exercise.]

6. Suppose the right R-module M satisfies the ascending chain condition on 
submodules. Then every generating set X of M contains a finite subset that 
generates M. 

7. Let F be a field and let $V = F^{(n)}$, the vector space of n-tuples of elements
of F, where n is a positive integer. Recall that an n-tuple $a := (a_1, \ldots, a_n)$ is
a zero of the polynomial $p(x_1, \ldots, x_n)$ if and only if $p(a_1, \ldots, a_n) = 0$ (more
precisely, the polynomial p is in the kernel of the evaluation homomorphism
$\epsilon_a : F[x_1, \ldots, x_n] \to F$ (see p. 209)).

(a) Show that if $\{p_1, p_2, \ldots\}$ is an infinite set of polynomials in the polynomial
ring $F[x_1, \ldots, x_n]$, and $V_i$ is the full set of zeros of the polynomial $p_i$, then
the set of common zeros of the $p_i$ (that is, the intersection $\bigcap V_i$) is actually
the intersection of finitely many of the $V_i$.

(b) Show that for any subset X of V there exists a finite set $S_X := \{p_1, \ldots, p_k\}$ of
polynomials such that any polynomial which vanishes on X is an
$F[x_1, \ldots, x_n]$-linear combination of those in the finite set $S_X$.

(c) Recall that a variety is the set of common zeros in V of some collection
of polynomials in $F[x_1, \ldots, x_n]$. The varieties in V form a poset under the
inclusion relation (see p. 209). Show that the poset of varieties of V possesses
the descending chain condition. [Hint: All three parts exploit the Hilbert Basis
Theorem.]


8.5.3 Exercises for Sect. 8.4


1. Let M be a left R-module and let A be an abelian group. Show that the
right R-scalar multiplication on $\operatorname{Hom}_{\mathbb{Z}}(M, A)$ defined by setting
$(fr)(m) = f(rm) \in A$ gives $\operatorname{Hom}_{\mathbb{Z}}(M, A)$ the structure of a right R-module. If A
is also a left R-module, is the subgroup $\operatorname{Hom}_R(M, A)$ an R-submodule of
$\operatorname{Hom}_{\mathbb{Z}}(M, A)$?

2. Let R, S be rings. Recall from p. 234 that an (R, S)-bimodule is an abelian
group M having the structure of a left R-module and the structure of a right
S-module such that these scalar multiplications commute, i.e., $r(ms) = (rm)s$
for all $m \in M$, $r \in R$ and $s \in S$.⁴ Now assume that M is an (R, S)-bimodule
and that N is a right S-module. Show that one may give $\operatorname{Hom}_S(M, N)$ the
structure of a right R-module by setting $(fr)(m) = f(rm)$, $r \in R$, $m \in M$,
$f \in \operatorname{Hom}_S(M, N)$.

3. Let M be a right R-module. Interpret and prove:

$$\operatorname{Hom}_R(R, M) \cong_R M.$$

4. Show that the “functorially induced mapping” of Eq. (8.24) is a homomorphism
of left R-modules. [Hint: the only real point is to verify the equation

$$\beta \circ (r\varphi) \circ \alpha = r(\beta \circ \varphi \circ \alpha)$$

using the left R-action defined for these modules on p. 261 preceding the
equation.]

5. Let R be a ring, let M be a right R-module, and let A be an abelian group.
Interpret and prove:

$$\operatorname{Hom}_R(M, \operatorname{Hom}_{\mathbb{Z}}(R, A)) \cong_{\mathbb{Z}} \operatorname{Hom}_{\mathbb{Z}}(M, A).$$

6. Let R be an integral domain in which every ideal is a free R-module. Prove that 
R is a principal ideal domain. 
7. (a) Let M, $N_1$ and $N_2$ be right R-modules. Show that
$$\operatorname{Hom}(M, N_1 \oplus N_2) \cong_{\mathbb{Z}} \operatorname{Hom}(M, N_1) \oplus \operatorname{Hom}(M, N_2).$$


More generally, show that if {Nq, a € A} is a family of right R-modules, 


then 
Hom ( Pp Ne) ~7 ae Hom(M, No). 


acA aceA 


⁴ Perhaps the most important example of a bimodule occurs as follows: if R is a ring and if $S \subseteq R$ is
a subring, then R is an (R, S)-bimodule.

(b) Let N, $M_1$ and $M_2$ be right R-modules and show that
$$\operatorname{Hom}(M_1 \times M_2, N) \cong_{\mathbb{Z}} \operatorname{Hom}(M_1, N) \times \operatorname{Hom}(M_2, N).$$
More generally, show that if {M,, a € A} is a family of R-modules, then 
nom( TT Ma, v) ~z, | | Hom(M,, N). 
acA acA 


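(c) Show that if

$$0 \longrightarrow N' \xrightarrow{\ \mu\ } N \xrightarrow{\ \epsilon\ } N'' \longrightarrow 0$$

is a split short exact sequence of R-modules, and if M is a fixed right
R-module, then the sequences

$$0 \longrightarrow \operatorname{Hom}_R(M, N') \xrightarrow{\ \operatorname{Hom}(1_M,\,\mu)\ } \operatorname{Hom}_R(M, N) \xrightarrow{\ \operatorname{Hom}(1_M,\,\epsilon)\ } \operatorname{Hom}_R(M, N'') \longrightarrow 0,$$

and

$$0 \longrightarrow \operatorname{Hom}_R(N'', M) \xrightarrow{\ \operatorname{Hom}(\epsilon,\,1_M)\ } \operatorname{Hom}_R(N, M) \xrightarrow{\ \operatorname{Hom}(\mu,\,1_M)\ } \operatorname{Hom}_R(N', M) \longrightarrow 0,$$

are both exact (and also split).

8. Show that the sequences which appear in Theorem 8.4.3 exhibiting the left or
right exactness of the “hom functors” are morphisms as left R-modules, not
just abelian groups.

9. Let $0 \to M' \to M \to M'' \to 0$ be a short exact sequence of right R-modules.
Prove that M is Noetherian (resp. Artinian) if and only if both M′, M″ are. [Of
course, this is just a restatement of Exercise (3) in Sect. 8.5.2.]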

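10. Prove that if $A_\alpha$, $\alpha \in \mathcal{A}$, is a family of abelian groups, then

$$\operatorname{Hom}_{\mathbb{Z}}\Bigl(R, \prod_{\alpha \in \mathcal{A}} A_\alpha\Bigr) \cong \prod_{\alpha \in \mathcal{A}} \operatorname{Hom}_{\mathbb{Z}}(R, A_\alpha).$$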


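11. Let $P_\alpha$, $\alpha \in \mathcal{A}$, be a family of right R-modules. Show that $\bigoplus_{\alpha \in \mathcal{A}} P_\alpha$ is projective
if and only if each $P_\alpha$ is projective.

12. Let $I_\alpha$, $\alpha \in \mathcal{A}$, be a family of right R-modules. Show that $\prod_{\alpha \in \mathcal{A}} I_\alpha$ is injective if
and only if each $I_\alpha$ is injective.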


13. Let P be an R-module. Prove that P is projective if and only if given any exact
sequence $0 \to M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$, the induced sequence

$$0 \longrightarrow \operatorname{Hom}_R(P, M') \xrightarrow{\ \operatorname{Hom}(1_P,\,\mu)\ } \operatorname{Hom}_R(P, M) \xrightarrow{\ \operatorname{Hom}(1_P,\,\epsilon)\ } \operatorname{Hom}_R(P, M'') \longrightarrow 0$$

is exact.

14. Suppose we have a sequence $0 \to M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$ of R-modules. Prove
that this sequence is exact if and only if the sequence

$$0 \longrightarrow \operatorname{Hom}_R(P, M') \xrightarrow{\ \operatorname{Hom}(1_P,\,\mu)\ } \operatorname{Hom}_R(P, M) \xrightarrow{\ \operatorname{Hom}(1_P,\,\epsilon)\ } \operatorname{Hom}_R(P, M'') \longrightarrow 0$$

is exact for every projective R-module P.

15. Let I be an R-module. Prove that I is injective if and only if given any exact
sequence $0 \to M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$, the induced sequence

$$0 \longrightarrow \operatorname{Hom}_R(M'', I) \xrightarrow{\ \operatorname{Hom}(\epsilon,\,1_I)\ } \operatorname{Hom}_R(M, I) \xrightarrow{\ \operatorname{Hom}(\mu,\,1_I)\ } \operatorname{Hom}_R(M', I) \longrightarrow 0$$

is exact.

16. Suppose we have a sequence $0 \to M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$ of R-modules. Prove
that this sequence is exact if and only if the sequence

$$0 \longrightarrow \operatorname{Hom}_R(M'', I) \xrightarrow{\ \operatorname{Hom}(\epsilon,\,1_I)\ } \operatorname{Hom}_R(M, I) \xrightarrow{\ \operatorname{Hom}(\mu,\,1_I)\ } \operatorname{Hom}_R(M', I) \longrightarrow 0$$

is exact for every injective R-module I.

17. Give an example of a non-split short exact sequence of the form

$$0 \to P \to M \to I \to 0$$

where P is a projective R-module and where I is an injective R-module.

18. Let F be a field and let R be the ring of $n \times n$ lower-triangular matrices over
F. For each $m \le n$ the F-vector space

$$L_m = \{[a_1\ a_2\ \cdots\ a_m\ 0\ \cdots\ 0] \mid a_1, a_2, \ldots, a_m \in F\}$$

is a right R-module. Prove that each $L_m$, $1 \le m \le n$, is a projective R-module,
but that none of the quotients $L_k/L_j$, $1 \le j < k \le n$, is projective.

19. A ring for which every ideal is projective is called a hereditary ring.
Prove that if F is a field, then the ring $M_n(F)$ of $n \times n$ matrices over F is
hereditary. The same is true for the ring of lower triangular $n \times n$ matrices
over F.

20. Let A be an abelian group and let $B \le A$ be such that A/B is infinite cyclic.
Prove that $A \cong A/B \times B$.

21. Let A be an abelian group and assume that $A = H \times C_1 = K \times C_2$ where $C_1$
and $C_2$ are infinite cyclic groups. Prove that $H \cong K$. [Hint: First
$H/(H \cap K) \cong HK/K \le A/K \cong C_2$, so $H/(H \cap K)$ is either trivial or infinite cyclic. Similarly
for $K/(H \cap K)$. Next $A/(H \cap K) \cong H/(H \cap K) \times C_1$ and
$A/(H \cap K) \cong K/(H \cap K) \times C_2$, so $H/(H \cap K)$ and $K/(H \cap K)$ are either both trivial (in which
case H = K) or both infinite cyclic. Thus, from the preceding Exercise (20) in
Sect. 8.5.3 obtain $H \cong H/(H \cap K) \times (H \cap K) \cong K/(H \cap K) \times (H \cap K) \cong K$,
done.]

22. Prove Baer's Criterion: Let I be a left R-module and assume that for any left
ideal $J \subseteq R$ and any R-module homomorphism $\alpha_J : J \to I$, $\alpha_J$ extends to an
R-module homomorphism $\alpha : R \to I$. Show that I is an injective module. [Hint:
Let $M' \subseteq M$ be R-modules and assume that there is an R-module
homomorphism $\alpha : M' \to I$. Consider the poset of pairs $(N, \alpha_N)$, where $M' \subseteq N \subseteq M$
and where $\alpha_N$ extends $\alpha$. Apply Zorn's Lemma to obtain a maximal element
$(N_0, \alpha_0)$. If $N_0 \neq M$, let $m \in M - N_0$ and let $J = \{r \in R \mid rm \in N_0\}$; note
that J is a left ideal of R. Now what?]


Chapter 9
The Arithmetic of Integral Domains 


Abstract Integral domains are commutative rings whose non-zero elements are 
closed under multiplication. If each nonzero element is a unit, the domain is called 
a field and is shipped off to Chap. 11. For the domains D which remain, divisibility 
is a central question. A prime ideal has the property that elements outside the ideal 
are closed under multiplication. A non-zero element $a \in D$ is said to be prime if the
principal ideal Da which it generates is a prime ideal. D is a unique factorization
domain (or UFD) if any expression of an element as a product of prime elements
is unique up to the order of the factors and the replacement of any prime factor by
a unit multiple. If D is a UFD, so is the polynomial ring D[X] where X is a finite
set of commuting indeterminates. In some cases, the unique factorization property
can be determined by the localizations of a domain. Euclidean domains (like the
integers, Gaussian and Eisenstein numbers) are UFD's, but many domains are not.
One enormous class of domains (which includes the algebraic integers) is obtained
the following way: Suppose K is a field which is finite-dimensional over a subfield F
which, in turn, is the field of fractions of an integral domain D. One can then define
the ring $\mathcal{O}_D(K)$ of elements of K which are integral with respect to D. Under modest
conditions, the integral domain $\mathcal{O}_D(K)$ will become a Noetherian domain in which
every prime ideal is maximal, a so-called Dedekind domain. Although not UFD's,
Dedekind domains offer a door prize: every ideal can be uniquely expressed as a
product of prime ideals (up to the order of the factors, of course).


9.1 Introduction 


Let us recall that an integral domain is a species of commutative ring with the 
following simple property: 


(ID) If D is an integral domain then there are no zero divisors—that is, there is no 
pair of non-zero elements whose product is the zero element of D. 


Of course this is equivalent (in the realm of commutative rings) to the proposition 


(ID′) The set D* of non-zero elements of D is closed under multiplication, and
thus (D*, ·) is a commutative monoid under ring multiplication.

As an immediate consequence one has the following:


Lemma 9.1.1 The collection of all non-zero ideals of an integral domain is closed
under intersection, and so forms a lattice under either the containment relation or its
dual. Under the containment relation, the ‘meets’ are intersections and the ‘joins’
are sums of two ideals.


Proof The submodules of $D_D$ form a lattice under the containment relation with
‘sum’ and ‘intersection’ playing the roles of ‘join’ and ‘meet’. Since such
submodules are precisely the ideals of D, it remains only to show that two nonzero
ideals cannot intersect at the zero ideal in an integral domain. But if A and B are
ideals carrying non-zero elements a and b, respectively, then ab is a non-zero element
of $A \cap B$.


We had earlier introduced integral domains as a class of examples in Chap.6 on 
basic properties of rings. One of the unique and identifying properties of integral 
domains presented there was: 


(The Cancellation Law) If (a, b,c) € D* x D x D, then ab = ac implies b = c. 


It is this law alone which defines the most interesting aspect of integral domains,
their arithmetic, which concerns who divides whom among the elements of D. (It
does not seem to be an interesting question in general rings with zero divisors, such
as full matrix algebras.)


9.2 Divisibility and Factorization 
9.2.1 Divisibility 


One may recall from Sect. 7.1.3 that the units of a ring R form a multiplicative group,
denoted U(R). Two elements a and b of a ring are called associates if a = bu for
some unit $u \in U(R)$. Since U(R) is a group, the relation of being an associate is an
equivalence relation. This can be seen in the following way: the group of units acts by
right multiplication on the elements of the ring R. Two elements are associates if and
only if they belong to the same U(R)-orbit under this action. Since the U(R)-orbits
partition the elements of R, the property of belonging to a common orbit is clearly
an equivalence relation on R. We call these equivalence
classes association classes.

Equivalence relations are fine, but how can we connect these up with “divisibility”?
To make the notion precise, we shall say that element a divides element b if and only
if b = ca for some element c in R. Notice that this concept is very sensitive to the
ambient ring R which contains elements a and b, for that is the “well” from which
a potential element c is to be drawn. By this definition,

• Every element divides itself.
• Zero divides only zero, while every element divides zero.
• If element a divides b and element b divides c, then a divides c.


Thus the divisibility relation is reflexive and transitive, and so we automatically inherit a
pre-order which is trying to tell us as much as possible about the question of who
divides whom.

As one may recall from the very first exercise for Chap. 2 (Sect. 2.7.1), for every
pre-order, there is an equivalence relation (that of being less-than-or-equal in both
directions) whose equivalence classes become the elements of a partially ordered
set. Under the divisibility pre-order for an integral domain D, equivalent elements
a and b should satisfy

$$a = sb \quad \text{and} \quad b = ta$$

for certain elements s and t in D. If either one of a or b is zero, then so is the other,
so zero can only be equivalent to zero. Otherwise, a and b are both non-zero and so,
by the Cancellation Law,

$$a = s(ta) = (st)a \quad \text{and} \quad st = 1.$$

In this case, both s and t are units. That means a and b are associates.

Conversely, if a and b are associates, they divide one another.

Thus in an integral domain D, the equivalence classes defined by the divisibility
preorder are precisely the association classes of D, that is, the multiplicative cosets
xU(D) of the group of units in the multiplicative monoid $D^* := (D - \{0\}, \cdot)$.
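
The equivalence just derived (two elements divide one another exactly when they are associates) can be tested by brute force in a concrete domain. The sketch below uses the Gaussian integers Z[i], whose units are 1, −1, i, −i; the pair encoding (a, b) for a + bi, the helper names and the search window are assumptions made for this illustration.

```python
from itertools import product

# Gaussian integers a + bi are stored as pairs (a, b).
UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # 1, -1, i, -i

def mul(x, y):
    """Product of two Gaussian integers."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def divides(a, b):
    """True when b = c*a for some Gaussian integer c."""
    if a == (0, 0):
        return b == (0, 0)                   # zero divides only zero
    norm = a[0] ** 2 + a[1] ** 2
    num = mul(b, (a[0], -a[1]))              # b * conjugate(a) = (b/a) * norm
    return num[0] % norm == 0 and num[1] % norm == 0

def associates(a, b):
    """True when a = b*u for some unit u."""
    return any(mul(b, u) == a for u in UNITS)

# Compare mutual divisibility with being associates on a small window.
window = list(product(range(-4, 5), repeat=2))
mismatches = [(a, b) for a in window for b in window
              if (divides(a, b) and divides(b, a)) != associates(a, b)]

print(len(window), mismatches)
```

An empty list of mismatches on the window is only a finite check, not a proof; the proof is the cancellation argument given in the text.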

The resulting poset of association classes of the integral domain D is called the 
divisibility poset and is denoted Div(D). Moreover: 


Lemma 9.2.1 (Properties of the divisibility poset) 


1. The additive identity element $0_D$ is an association class that is a global maximum
of the poset Div(D).

2. The group of units U(D) is an association class comprising the global minimum
of the poset Div(D).


One can also render this poset in another way. We have met above the lattice of
all ideals of D under inclusion, which we will denote here by the symbol L(D). Its
dual lattice (ideals under reverse containment) is denoted L(D)*. We are interested
in a certain induced sub-poset of L(D)*. Ideals of the form xD which are generated
by a single element x are called principal ideals. Let P(D) be the subposet of
L(D) induced on the collection of all principal ideals of D, including the zero ideal,
0 = 0D. We call this the principal ideal poset. To obtain a comparison with the
poset Div(D), we must pass to the dual. Thus P(D)* is defined to be the subposet
of L(D)* induced on all principal ideals, that is, the poset of principal ideals ordered
by reverse inclusion.

We know the following:

Lemma 9.2.2 (Divisibility and principal ideal posets) In an integral domain D, the
following holds:


1. Element a divides element b if and only if $bD \subseteq aD$.

2. Elements a and b are associates if and only if aD = bD.

3. So there is a bijection between the poset P(D)* of all principal ideals under
reverse containment, and the poset Div(D) of all multiplicative cosets of the
group of units (the association classes of D) under the divisibility relation, given
by the map $f : xD \mapsto xU(D)$.

4. Moreover, the mapping

$$f : P(D)^* \to \operatorname{Div}(D)$$

is an isomorphism of posets.


One should be aware that the class of principal ideals of a domain D need not be 
closed under taking intersections and taking sums. This has a lot to do—as we shall 
soon see—with the questions of the existence of “greatest” common divisors, “least” 
common multiples and ultimately the success or failure of unique factorization. 
Notice that here, the “meet” of two cosets xU(D) and yU(D) would be a coset 
dU(D) such that dU(D) divides both xU(D) and yU(D) and is the unique coset 
maximal in Div(D) having this property. We pull this definition back to the realm of 
elements of D in the following way: 


The element d is a greatest common divisor of two non-zero elements a and b of an 
integral domain D if and only if: 


1. d divides both a and b. 
2. If e divides both a and b, then e divides d. 


Clearly, then, if a greatest common divisor of two elements of a domain exists, 
it is unique up to taking associates—i.e. up to a specified coset of U(D)—and this 
association class is unchanged upon replacing any of the two original elements by 
any of their associates. 

In fact it is easy to see that if d is the greatest common divisor of a and b, then 
dD is the smallest principal ideal containing both aD and bD. Note that this might 
not be the ideal aD + bD, the join in the lattice of all ideals of D. 

Similarly, lifting back the meaning of a “join” in the poset Div(D), we say that 
the element m is a least common multiple of two elements a and b in an integral 
domain D if and only if: 


1. Both a and b divide m (i.e. m is a multiple of both a and b). 
2. If n is also a common multiple of a and b, then n is a multiple of m. 


Again, we note that if m is the least common multiple of a and b, then mD is the 
largest principal ideal contained in both aD and bD. Again, this does not mean that 
the ideal mD is aD ∩ bD, the meet of aD and bD in the full lattice L(D) of all ideals 
of D. 

The above discussion calls attention to the special case in which the principal 
ideal poset P(D) actually coincides with the full lattice of ideals L(D). An integral 
domain D for which P(D) = L(D) is called a principal ideal domain, or PID for 
short. From our discussion in the preceding paragraphs we have: 


Lemma 9.2.3 For any two elements a and b of a principal ideal domain D, a greatest 
common divisor d = gcd(a, b) and a least common multiple m = lcm(a, b) exist. 
For the least common multiple m, one has Dm = Da ∩ Db, the meet of Da and Db 
in the lattice L(D). Similarly the greatest common divisor d generates the join of 
the ideals spanned by a and b. That is, 


Da + Db = Dd. 


Thus there exist elements s and t in D, such that d = sa + tb. 
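
The statements of Lemma 9.2.3 can be checked numerically in the most familiar principal ideal domain, D = Z. The following is only a sketch under stated assumptions: ideals are truncated to a finite window [-bound, bound], and the helper names lcm and ideal, the sample values 12 and 18, and the search range for s and t are illustrative choices, not part of the text.

```python
from math import gcd

def lcm(a: int, b: int) -> int:
    """Least common multiple of two non-zero integers (non-negative representative)."""
    return abs(a * b) // gcd(a, b)

def ideal(generator: int, bound: int) -> set:
    """Elements of the principal ideal generator*Z lying in [-bound, bound]."""
    return {n for n in range(-bound, bound + 1) if n % generator == 0}

a, b, bound = 12, 18, 200
d, m = gcd(a, b), lcm(a, b)

# Meet in L(Z): the intersection aZ ∩ bZ is the principal ideal generated by lcm(a, b).
meet_matches = ideal(a, bound) & ideal(b, bound) == ideal(m, bound)

# Join in L(Z): every sum s*a + t*b is a multiple of d ...
sums = {s * a + t * b for s in range(-20, 21) for t in range(-20, 21)}
join_inside = all(n % d == 0 for n in sums)
# ... and d itself occurs as such a sum, so aZ + bZ = dZ.
d_is_combination = d in sums

print(d, m, meet_matches, join_inside, d_is_combination)
```

A finite window cannot prove the ideal equalities, of course; it only illustrates them for one pair of generators.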


9.2.2 Factorization and Irreducible Elements 


The next two lemmas display further instances in which a question on divisibility in 
an integral domain, refers back to properties of the poset of principal ideals. 

A factorization of an element of the integral domain D is simply a way of repre- 
senting it as a product of other elements of D. Of course one can write an element 
as a product of a unit and another associate in as many ways as there are elements 
in U(D). The more interesting factorizations are those in which none of the factors 
are units. These are called proper factorizations. 

Suppose we begin with an element x of D. If it is a unit, it has no proper 
factorization. Suppose element x is not a unit, and has a proper factorization x = x_1 y_1 into 
two non-zero non-units. We attempt, then, to factor each of these factors into two 
further factors. If such a proper factorization is not possible for one factor, one 
proceeds with the other proper factor. In this way one may imagine an infinite schedule 
of such factorizations. This schedule would correspond to a downwardly growing 
binary tree in the graph of the divisibility poset Div(D). An end-node to this tree 
(that is, a vertex of degree one) results if and only if one obtains a factor y which 
possesses no proper factorization. We call such an element “irreducible”. 

Precisely, an element y of D* is irreducible if and only if it is a non-zero non-unit 
with the property that if y = ab is a factorization, one of the factors a or b is a 
unit. Applying the poset isomorphism between Div(D) and P(D)* given in Lemma 9.2.2 
above, these comments force the following: 


Lemma 9.2.4 If D is an integral domain, an element y of D is an irreducible 
element if and only if the principal ideal yD is non-zero and does not properly lie in 
any other principal ideal except D itself (that is, it is maximal in the induced poset 
P(D) − {D, 0} of proper principal ideals of D under containment). 


Here is another important central result of this type which holds for a large class 
of integral domains. 
Lemma 9.2.5 Let D be an integral domain possessing the ascending chain condition 
on its poset P(D) of principal ideals. Then every non-zero non-unit is the product 
of a unit and a finite number of irreducible elements. 


Proof Suppose there were a non-zero element x ∈ D − U(D) which was not the 
product of a finite number of irreducible elements. Then the collection T of principal 
ideals xD, where x is a non-zero non-unit that is not a product of finitely many 
irreducible elements, is non-empty. We can thus choose an ideal yD which is maximal 
in T because T inherits the ascending chain condition from the poset of principal 
ideals. Since y itself is not 
irreducible, there is a factorization y = ab where neither a nor b is a unit. Then aD 
properly contains yD, since otherwise aD = yD and y is an associate ua of a where 
u is a unit. Instantly, the cancellation law yields b = u contrary to hypothesis. Thus 
certainly the principal ideal aD does not belong to T. Then a is a product of finitely 
many irreducible elements of D. Similarly, b is a product of finitely many irreducible 
elements of D. Hence their product y must also be so, contrary to the choice of y. 
The proof is complete. 
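
The splitting process behind Lemma 9.2.5 can be sketched for the positive integers, where the ascending chain condition on principal ideals holds because every proper factor of n > 1 is strictly smaller than n. The function names below and the use of trial division to find a proper factorization are assumptions made only for this illustration.

```python
def proper_factor(n: int):
    """Return a proper factorization (a, b) of n > 1 into non-units, or None if n is irreducible."""
    a = 2
    while a * a <= n:
        if n % a == 0:
            return a, n // a
        a += 1
    return None

def irreducible_factors(n: int) -> list:
    """Split n > 1 recursively until every factor is irreducible."""
    split = proper_factor(n)
    if split is None:
        return [n]          # an end-node of the factorization tree
    a, b = split
    return irreducible_factors(a) + irreducible_factors(b)

print(irreducible_factors(360))
```

The recursion must stop because each step passes to a principal ideal aZ that properly contains nZ, and such a chain of ideals cannot ascend forever.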


9.2.3 Prime Elements 


So far, we have been discussing irreducible elements. A similar, but distinct notion 
is that of a prime element. A non-zero non-unit r in D is said to be a prime, if and 
only if, whenever r divides a product ab, then either r divides a or r divides b. 

One of the very first observations to be made from this definition is that the notion 
of being prime is stronger than the notion of being irreducible. Thus: 


Lemma 9.2.6 In any integral domain, any prime element is irreducible. 


Proof Let r be a prime element of the integral domain D. Suppose, by way of 
contradiction, that r is not irreducible, so that r = ab, where a and b are non-units. 
Then r divides ab, and so, as r is prime, r divides one of the two factors a or b, say 
a. Writing a = rv for some v ∈ D, we get r = rvb. But then, by the cancellation law, 1 = vb, 
whence b is a unit, contrary to our assumption. 

Thus r is irreducible. 


In general, among integral domains, an irreducible element may not be prime.¹ 


However, it is true in principal ideal domains that every irreducible element is prime. 


Lemma 9.2.7 In a principal ideal domain, every irreducible element is a prime 
element. 


Proof Let x be an irreducible element in the principal ideal domain D. By definition 
x is a non-unit. Assume, by way of contradiction, that x is not a prime. Then there 
exist elements a and b in D such that x divides ab, but x does not divide either 
a or b. Notice that if a were a unit, then x would divide b, against our hypothesis. 
Thus it follows that neither a nor b is a unit. Thus Da is a proper ideal that does not 
lie in Dx. Since x is irreducible, Dx is a maximal ideal in D (Lemma 9.2.4), and is 
properly contained in the ideal Dx + Da. Thus Dx + Da = D, and so 1 = d_1 a + d_2 x 
for some elements d_1, d_2 of D. But since x divides ab, we may write ab = d_3 x for 
some d_3 ∈ D. Now 


b = b · 1 = b(d_1 a + d_2 x) = d_1 ab + d_2 xb 
  = d_1 (d_3 x) + d_2 xb = (d_1 d_3 + d_2 b)x. 


Thus b is a multiple of x, which is impossible, since x does not divide b by hypothesis. 
This contradiction tells us that x is indeed a prime element. 


¹ An example is given under the title “A Special Case” on p. 290. 


Thus, in a principal ideal domain, the set of irreducible elements and the set of 
prime elements coincide. 
The reader will observe the following: 


Corollary 9.2.8 In a principal ideal domain D, a non-zero element a is prime if and only 
if the factor ring D/Da is an integral domain. 


9.3 Euclidean Domains 


An integral domain D is said to be a Euclidean Domain if and only if there is a 
function g : D* → N into the natural numbers (non-negative integers)² satisfying 
the following: 


(ED1) For a, b ∈ D* := D − {0}, g(ab) ≥ g(a). 
(ED2) If (a, b) ∈ D × D*, then there exist elements q and r such that a = bq + r, 
where either r = 0 or else g(r) < g(b). 


The notation in (ED2) is intentionally suggestive: q stands for “quotient” and r 
stands for “remainder’’. The function g is sometimes called the “grading function” 
of the domain D. 

Recall that in a ring R, an ideal I of the form xR is called a principal ideal. An 
integral domain in which every ideal is a principal ideal is called a principal ideal 
domain or PID. We have 


Theorem 9.3.1 Every Euclidean domain is a principal ideal domain. 


Proof Let D be a Euclidean domain with grading function g : D* → N. Let J be 
any non-zero ideal in D (the zero ideal is certainly principal). Among the non-zero 
elements of J we can find an element x with 
g(x) minimal. Let a be any other element of J. Then a = qx + r as in (ED2) where 
r = 0 or g(r) < g(x). Since r = a − qx ∈ J, minimality of g(x) shows that the 
second alternative cannot occur, so r = 0 and a = qx. Thus J ⊆ xD. But x ∈ J, 
an ideal, already implies xD ⊆ J so J = xD is a principal ideal. Since J was an 
arbitrary ideal, D is a PID. 


² As remarked several times in this book, the term “natural numbers” includes zero. 


There are, however, principal ideal domains which are not Euclidean domains. 

Clearly any Euclidean domain D possesses the algebraic property that it is a PID, 
so that greatest common divisors d = gcd(a, b) always exist along with elements s 
and t such that d = sa + tb (Lemma 9.2.3 and Theorem 9.3.1). What is really new here is 
the computational-logical property that the greatest common divisor d = gcd(a, b) as 
well as the elements s and t can actually be computed! Behold! 


EUCLIDEAN ALGORITHM. Given a and b in D and D*, respectively, by (ED2) we 
have 

a = b q_1 + r_1, 

and if r_1 ≠ 0, 

b = r_1 q_2 + r_2, 

and if r_2 ≠ 0, 

r_1 = r_2 q_3 + r_3, 

etc., until we finally obtain a remainder r_k = 0 in 

r_{k-2} = r_{k-1} q_k + r_k. 


Such a termination of this process is inevitable for g(r_1), g(r_2), ... is a strictly 
decreasing sequence of non-negative integers which can only terminate at g(r_{k-1}) 
when r_k = 0. 

We claim the number r_{k-1} (the last non-zero remainder) is a greatest common 
divisor of a and b. First r_{k-2} = r_{k-1} q_k and also r_{k-3} = r_{k-2} q_{k-1} + r_{k-1} are multiples 
of r_{k-1}. Inductively, if r_{k-1} divides both r_j and r_{j+1}, it divides r_{j-1}. Thus both 
sides of each of the equations above are multiples of r_{k-1}; in particular, a and b 
are multiples of r_{k-1}. On the other hand if d′ is a common divisor of a and b, d′ 
divides r_1 = a − b q_1 and in general divides r_j = r_{j-2} − r_{j-1} q_j and so d′ divides 
r_{k-1}, eventually. Thus d := r_{k-1} is a greatest common divisor of a and b. 

Also an explicit expression of d = r_{k-1} as a D-linear combination sa + tb can be 
obtained from the same sequence of equations. For 


d = r_{k-1} = r_{k-3} − r_{k-2} q_{k-1}, 


and each r_j similarly is a D-linear combination of r_{j-2} and r_{j-1}, j = 1, 2, ... 
Successive substitutions for the r_j ultimately produce an expression d = sa + tb. 
All quite computable. 
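
For D = Z the procedure just described can be written out in a few lines. The sketch below carries the coefficients s and t forward alongside the remainders instead of substituting backwards at the end; that bookkeeping, the function name extended_euclid, and the use of Python's floor division to supply the quotient in (ED2) are choices made for this illustration.

```python
def extended_euclid(a: int, b: int):
    """Return (d, s, t) with d = s*a + t*b and d a gcd of a and b; b must be non-zero."""
    # Invariants: r_prev = s_prev*a + t_prev*b and r_curr = s_curr*a + t_curr*b.
    r_prev, r_curr = a, b
    s_prev, s_curr = 1, 0
    t_prev, t_curr = 0, 1
    while r_curr != 0:
        q = r_prev // r_curr                              # quotient from (ED2)
        r_prev, r_curr = r_curr, r_prev - q * r_curr      # next remainder
        s_prev, s_curr = s_curr, s_prev - q * s_curr
        t_prev, t_curr = t_curr, t_prev - q * t_curr
    return r_prev, s_prev, t_prev                         # last non-zero remainder

d, s, t = extended_euclid(240, 46)
print(d, s, t, s * 240 + t * 46)
```

For negative inputs the returned d may be negative; it is still a greatest common divisor, since gcd's are only determined up to a unit.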
9.3.1 Examples of Euclidean Domains 


Example 45. The Gaussian integers and the Eisenstein integers. 

These are respectively the domains D_1 := Z ⊕ Zi (where i^2 = −1) 
and D_2 := Z ⊕ Zω (where ω = e^{2πi/3}). The complex number ω is a zero of the 
irreducible polynomial x^2 + x + 1, and so is a cube root of unity distinct from 1. 

Both of these domains are subrings of C, the complex number field, and each is 
invariant under complex conjugation. Thus both of them admit a multiplicative norm 
function N : C → R where N(z) := z · z̄ records the square of the Euclidean distance 
of z from zero in the complex plane. (As usual, z̄ denotes the complex conjugate of 
the complex number z.) 

We shall demonstrate that these rings are Euclidean domains with the norm func- 
tion in the role of g. That its values are taken in the non-negative integers follows 
from the formulae 


N(a + bi) = a^2 + b^2, 
N(a + bω) = a^2 − ab + b^2. 


As already remarked, the norm function is multiplicative so (ED1) holds. To 
demonstrate (ED2) we choose elements a and b of D = D_1 or D_2 with b ≠ 0. Now the 
elements of D_1 form a square tessellation of the complex plane with {0, 1, i, i + 1} 
as a fundamental square. Similarly, the elements of D_2 form the equilateral triangle 
tessellation of the complex plane with {0, 1, −ω^2} or {0, 1, −ω} at the corners of the 
fundamental triangles. We call the points where three or more tiles of the tessellation 
meet lattice points. 

When we superimpose the ideal bD on either one of these tessellations of the 
plane, we are depicting a tessellation of the same type on a subset of the lattice 
points with a possibly larger fundamental tile. Thus for the Gaussian integers, bD 
is a tessellation whose fundamental square is defined by the four “corner” points 
{0, b, ib, (1 + i)b}. The resulting lattice points are a subset of the Gaussian integers 
closed under addition (vector addition in the geometric picture). Similarly for the 
Eisenstein integers, a fundamental triangle of the tessellation defined by the ideal 
bD is the one whose “corners” are in {0, b, −bω^2} or {0, b, −bω}. In either case 
the tiles of the tessellation bD cover the plane, and so the element a must fall in some 
tile T, either a square or an equilateral triangle, of the superimposed tessellation 
bD. Now let qb be a corner point of T which is nearest a. Since T is a square or an 
equilateral triangle, we have 


(Nearest Corner Principle) The distance from a point of T to its nearest corner is 
less than the length of a side of T 


Thus the distance |a − qb| = √N(a − qb) is less than |b| = √N(b). Using the 
fact that the square function is monotone increasing on non-negative real numbers, 
we have 


g(a − bq) = N(a − bq) < N(b) = g(b). 

Thus setting r = a − qb, we have r = 0 or g(r) < g(b). Thus (with respect to the 
function g), r serves to realize the condition (ED2).³ 
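
For the Gaussian integers the nearest-corner argument translates directly into a division routine. In the sketch below a Gaussian integer x + yi is represented by the pair (x, y); this representation, the function names, and the choice of rounding each coordinate of the exact quotient a/b to a nearest integer (which picks a lattice point qb of bD nearest to a) are assumptions made for the illustration.

```python
from fractions import Fraction

def norm(z):
    """N(x + yi) = x^2 + y^2 for a Gaussian integer given as a pair (x, y)."""
    x, y = z
    return x * x + y * y

def gauss_divmod(a, b):
    """Return (q, r) with a = b*q + r and N(r) < N(b); b must be non-zero."""
    (a1, a2), (b1, b2) = a, b
    n = norm(b)
    # Exact quotient a/b = a * conj(b) / N(b), computed in Q(i).
    re = Fraction(a1 * b1 + a2 * b2, n)
    im = Fraction(a2 * b1 - a1 * b2, n)
    # Round each coordinate to a nearest integer: q*b is a lattice point of bD near a.
    q1, q2 = round(re), round(im)
    r = (a1 - (b1 * q1 - b2 * q2), a2 - (b1 * q2 + b2 * q1))
    return (q1, q2), r

q, r = gauss_divmod((27, 23), (8, 1))
print(q, r, norm(r) < norm((8, 1)))
```

Rounding both coordinates leaves an error of at most 1/2 in each direction, so N(r) is at most N(b)/2, comfortably within the bound required by (ED2).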

Before passing to the next example, this may be a good place to illustrate how 
embedding one domain into another sometimes may provide new results for the 
original domain: in this case we are embedding the integers Z into the Gaussian 
integers Z[i]. When we select a prime integer (called a rational prime), the question 
is raised whether it is still a prime element in the ring of Gaussian integers (a Gaussian 
prime). Theorem 9.3.3 below gives a complete answer. 


Lemma 9.3.2 Let p be any rational prime. If p is not a Gaussian prime then p is 
the sum of two integer squares. 


Proof Suppose that the rational prime p is the product of two non-units in Z[i], say 
p = ζη. Taking norms, we have p^2 = N(ζ)N(η). Since a Gaussian integer is a 
unit if and only if its norm is 1, it follows that the last two norms of the previous 
equation are both integers greater than 1 dividing p^2. Since p is a rational prime, each of 
these norms is equal to p. Thus, writing ζ = a + bi, one has p = N(ζ) = a^2 + b^2. 


Theorem 9.3.3. Let p be a rational prime. Then p is a Gaussian prime if and only 
if it leaves a remainder of 3, when divided by 4. 


Proof First, the prime integer 2 is not a Gaussian prime since 2 = (1 + i)(1 − i), 
so we may assume that the rational prime p is odd. Since the square of every odd 
integer leaves a remainder of 1 when divided by 4, the sum of two integer squares 
can only be congruent to 0, 1, or 2 modulo 4. Thus if p ≡ 3 mod 4, it is not the 
sum of two integer squares and so must be a Gaussian prime, by the previous Lemma 
9.3.2. 

That leaves the case that p ≡ 1 mod 4. In this case Z/(p) is a field whose 
multiplicative group of non-zero elements contains a unique cyclic subgroup of order 
4 whose generator has a square equal to −1, that is, there exists an integer b such 
that b^2 ≡ −1 mod p. Let P be the principal ideal generated by p in the domain 
of Gaussian integers. Since Z[i] is Euclidean, it is a principal ideal domain, and so, 
by Corollary 9.2.8, the factor ring Z[i]/P is an integral domain if and only if p is a 
prime in Z[i]. But since b ≢ 0 mod p, the numbers 1 ± bi cannot be multiples of 
p in Z[i]. Thus the equation 


(1 + bi + P)(1 − bi + P) = (1 + bi)(1 − bi) + P = (1 + b^2) + P = P, 


reveals that the factor ring Z[i]/P possesses zero divisors, and so cannot be an 
integral domain. Accordingly p is not a prime element of Z[i]. 


³ Usually boxing matches are held (oxymoronically) in square “boxing rings”. Even if (as well) 
they were held in equilateral triangles, it is a fact that when the referee commands a boxer to “go 
to a neutral corner”, the fighter really does not have far to go (even subject to the requirement of 
“neutrality” of the corner). But woe the fighter who achieves a knock-down near the center, of a 
more ring-like N-gon for large N. After the mandatory hike to a neutral corner, the fighter can only 
return to the boxing match totally exhausted from the long trip. No Euclidean algorithm for these 
boxers. 
Corollary 9.3.4 If p is a rational prime of the form 4n + 1, then p is a sum of two 
integer squares. 


Proof By Theorem 9.3.3 p is not a Gaussian prime. Now apply Lemma 9.3.2. 
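
Theorem 9.3.3 and Corollary 9.3.4 are easy to test on small primes. The following brute-force sketch (the function names and the range of primes are illustrative assumptions, and nothing here replaces the proofs) lists each prime below 60 with its residue mod 4 and a representation as a sum of two squares when one exists.

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial-division primality test for small rational integers."""
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))

def two_squares(p: int):
    """Return (a, b) with a^2 + b^2 = p and 0 <= a <= b, or None if no such pair exists."""
    for a in range(isqrt(p) + 1):
        b = isqrt(p - a * a)
        if a <= b and a * a + b * b == p:
            return a, b
    return None

for p in filter(is_prime, range(2, 60)):
    print(p, p % 4, two_squares(p))
```

In the output, the primes with no representation should be exactly those congruent to 3 mod 4, which by Lemma 9.3.2 and Theorem 9.3.3 are the rational primes that remain prime in Z[i].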


This result does not seem easy to prove within the realm of integers alone. 


Example 46 Polynomial domains over a field. Let F be a field and let F[x] be the 
ring of polynomials in indeterminate x and with coefficients from F. We let deg be 
the degree function, F[x] → N.⁴ Then if f(x) and g(x) are non-zero polynomials, 


deg( fg) = deg f + deg g, 


so (ED1) holds. The condition (ED2) is the familiar long division algorithm of 
College Algebra and grade school. 
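
As a reminder of how (ED2) works in F[x], here is a sketch of long division for F = Q. Polynomials are represented as coefficient lists with the constant term first; this representation and the function name poly_divmod are assumptions of the example.

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Divide a by b in Q[x]; polynomials are coefficient lists, lowest degree first.

    Returns (q, r) with a = b*q + r and r = [] (zero) or deg r < deg b.
    Assumes b has a non-zero leading coefficient.
    """
    r = [Fraction(c) for c in a]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coeff = r[-1] / Fraction(b[-1])
        q[shift] = coeff
        # Subtract coeff * x^shift * b(x); this cancels the leading term of r.
        for i, c in enumerate(b):
            r[i + shift] -= coeff * c
        # Drop the cancelled leading term and any further zero leading terms.
        while r and r[-1] == 0:
            r.pop()
    return q, r

# (x^3 - 2x + 5) divided by (2x - 1)
q, r = poly_divmod([5, -2, 0, 1], [-1, 2])
print(q, r)
```

Each pass through the loop lowers the degree of the running remainder, which is why the process stops with r = 0 or deg r < deg b.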


9.4 Unique Factorization 


9.4.1 Factorization into Prime Elements 


Let D be any integral domain. Recall that a non-zero non-unit of D is said to be 
irreducible if it cannot be written as the product of two other non-units. This means 
that if r is irreducible and r = ab, then either a or b is a unit. 

Recall also that a non-zero non-unit r in D is said to be a prime element, if and 
only if, whenever r divides a product ab, then either r divides a or r divides b. 

In the case of the familiar ring of integers (indeed for all PID’s), the two notions 
coincide, and indeed they are forced to coincide in an even larger collection of 
domains. 

An integral domain D is said to be a unique factorization domain (or UFD) if and 
only if 


(UFD1) Every non-zero non-unit is a product of a unit and a finite number of 
irreducible elements. 
(UFD2) Every irreducible element is prime. 


We shall eventually show that a large class of domains—the principal ideal 
domains—are UFD’s. (See Theorem 9.4.3.) One of the very first observations to 
be made from these definitions is that the notion of being prime is stronger than the 
notion of being irreducible. Thus: 


⁴ The student will recall the familiar degree function, that records the highest exponent of x that 
appears when a polynomial is expressed as a linear combination of powers of x with non-zero 
coefficients. 
Lemma 9.4.1 The following are equivalent: 


(i) D is a UFD. 
(ii) In D every non-zero non-unit is a product of a unit and a finite number of primes. 


Proof (i) ⇒ (ii) is obvious since (i) implies each irreducible element is a prime. 
(ii) ⇒ (i). First we show that every irreducible element is prime. Let r be 
irreducible. Then by (ii), r is a product of a unit and a finite number of primes. But since 
r is irreducible, r = up where u is a unit and p is a prime, so r is an associate of 
a prime and hence is itself prime. 
Now every non-zero non-unit is a product of a unit and finitely many irreducible 
elements, since each prime is irreducible by Lemma 9.2.6. Thus the two defining 
properties of a UFD follow from (ii) and the proof is complete. 


Theorem 9.4.2 (The Basic Unique Factorization Theorem) Let D be an integral 
domain which is a UFD. Then every non-zero non-unit can be written as a product 
of finitely many primes. Such an expression is unique up to the order in which the 
prime factors appear, and the replacement of any prime by an associate, that is, a 
multiple of that prime by a unit. 


Proof Let r be a non-zero non-unit of the UFD D. Then by Lemma 9.4.1, r can be 
written as a unit times a product of finitely many primes. Suppose this could be done 
in more than one way, say, 


r = u p_1 p_2 ⋯ p_s = v q_1 q_2 ⋯ q_t 


where u and v are units and the p_i and q_i are primes. If s = 0 or t = 0, then r 
is a unit, contrary to the choice of r. So without loss of generality we may assume 
0 < s ≤ t. Now as p_1 is a prime, it divides one of the right hand factors, and this 
factor cannot be v. Rearranging the indexing if necessary, we may assume p_1 divides 
q_1. Since q_1 is irreducible, p_1 and q_1 are associates, so q_1 = u_1 p_1. Then 


r = u p_1 ⋯ p_s = (v u_1) p_1 q_2 ⋯ q_t 


so, by the cancellation law 


u p_2 ⋯ p_s = (v u_1) q_2 ⋯ q_t, 


with t − 1 prime factors on the right side. By induction on t, we have s = t and q_i = 
u_i p_{π(i)} for some unit u_i, i = 2, ..., t, and permutation π of these indices. Thus the 
two factorizations involve the same number of primes with the primes in one of the 
factorizations being associates of the primes in the other, written in some possibly 
different order. 
A Special Case 


Of course we are accustomed to the domain of the integers, which is a UFD. So it 
might be instructive to look at a more pathological case. 

Consider the ring D = {a + b√−5 | a, b ∈ Z}. Being a subring of the field of 
complex numbers C, it is an integral domain. 

The mapping z ↦ z̄, where z̄ is the complex conjugate of z, is an automorphism 
of C which leaves D invariant (as a set) and so induces an automorphism of D. We 
define the norm of a complex number ζ to be N(ζ) := ζ ζ̄. Then N : C → R^+, 
the non-negative real numbers, and N(ζψ) = N(ζ)N(ψ) for all ζ, ψ ∈ C. Clearly, 
N(a + b√−5) = a^2 + 5b^2 so the restriction of N to D has non-negative integer 
values. (Moreover, we obtain for free the fact that the integers of the form a^2 + 5b^2 
are closed under multiplication.) 

Let us determine the units of D. Clearly if ν ∈ D is a unit, then there is a μ in D 
so that νμ = 1 = 1 + 0 · √−5, the identity of D. Then 


N(ν)N(μ) = N(νμ) = N(1) = 1^2 + 5 · 0^2 = 1. 

But how can the integer 1 be expressed as the product of two non-negative integers? 
Obviously, only if N(ν) = 1 = N(μ). But if a^2 + 5b^2 = 1, it is clear that b = 0 and 
a = ±1. Thus 


U(D) = {±1} = {d ∈ D | N(d) = 1}. 


We can also use norms to locate irreducible elements of D. For example, if ζ is an 
element of D of norm 9, then ζ is irreducible. For otherwise, one would have ζ = ψη 
where ψ and η are non-units. But that means N(ψ) and N(η) are both integers larger 
than one. Yet 9 = N(ψ)N(η) so N(ψ) = N(η) = 3, which is impossible since 3 is 
not of the form a^2 + 5b^2. 

But now note that 


9 = 3 · 3 = (2 + √−5)(2 − √−5) 


are two factorizations of 9 into irreducible factors (irreducible, because they have 
norm 9) and, as U(D) = {±1}, 3 is not an associate of either factor on the right 
hand side. 

Thus unique factorization fails in the domain D = Z ⊕ Z√−5. The reason is 
that 3 is an irreducible element which is not a prime. It is not a prime because 3 
divides (2 + √−5)(2 − √−5), but does not divide either factor. 
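
The norm computations in this example are small enough to check by machine. In the sketch below an element a + b√−5 is represented by the integer pair (a, b); this representation, the function names, and the bounded search are assumptions made for the illustration.

```python
from math import isqrt

def norm(a: int, b: int) -> int:
    """N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    return a * a + 5 * b * b

def multiply(z, w):
    """Product in Z[sqrt(-5)]; elements are pairs (a, b) meaning a + b*sqrt(-5)."""
    (a, b), (c, d) = z, w
    return (a * c - 5 * b * d, a * d + b * c)

def elements_of_norm(n: int):
    """All (a, b) with a^2 + 5*b^2 = n, found by a bounded search."""
    bound = isqrt(n) + 1
    return [(a, b) for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1) if norm(a, b) == n]

print(multiply((3, 0), (3, 0)), multiply((2, 1), (2, -1)))  # both products equal 9
print(elements_of_norm(3))   # no element of norm 3
print(elements_of_norm(9))   # the elements of norm 9, all irreducible
```

The empty list for norm 3 is what makes every element of norm 9 irreducible, and the two products confirm that 9 has two essentially different factorizations.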
9.4.2 Principal Ideal Domains Are Unique Factorization Domains 


The following theorem utilizes the fact that if an integral domain possesses the 
ascending chain condition on principal ideals, then every non-zero non-unit is a product of 
a unit and finitely many irreducible elements (Lemma 9.2.5). 


Theorem 9.4.3 Any PID is a UFD. 


Proof Assume D is a PID. Then by Theorem 8.2.11 of Chap. 8, D has the ascending 
chain condition on all ideals, and so by Lemma 9.2.5, the condition (UFD1) holds. 

It remains to show (UFD2), that every irreducible element is prime. Let x be 
irreducible. Then Dx is a maximal ideal (Lemma 9.2.4). 

Now if x were not a prime, there would exist elements a and b such that x divides 
ab but x does not divide either a or b. Thus a ∉ xD, b ∉ xD, yet ab ∈ xD. Thus xD 
is not a prime ideal, contrary to the conclusion of the previous paragraph that xD is 
a maximal ideal, and hence a prime ideal. 


Corollary 9.4.4 If F is a field, the ring of polynomials F[x] is a UFD. 


Proof We have observed in Example 46 that F[x] is a Euclidean ring with respect 
to the degree function. Thus by Theorem 9.3.1, F[x] is a PID, and so is a UFD by 
Theorem 9.4.3 above. 


9.5 If D Is a UFD, Then so Is D[x] 


We begin with three elementary results for arbitrary integral domains D, regarded 
as subrings of the polynomial rings D[x]. 


Lemma 9.5.1 U(D[x]) = U(D). 


Proof Since D is an integral domain, degrees add in taking the products of non-zero 
polynomials. Thus the units of D[x] must have degree zero and so must lie in D. 
That a unit of D is a unit of D[x] depends only upon the fact that D is a subring of 
D[x]. 


Lemma 9.5.2. Let p be an element of D, regarded as a subring of D[x]. If p divides 
the polynomial q(x) in D[x], then every coefficient of q(x) is divisible by p. 


Proof If p divides q(x) in D[x], then q(x) = p f(x) for some f(x) ∈ D[x]. Then 
the coefficients of q(x) are those of f(x) multiplied by p. 


Lemma 9.5.3 (Prime elements of D are prime elements of D[x].) Let p be a prime 
element of D. If p divides a product p(x)q(x) of two elements of D[x], then either 
p divides p(x) or p divides q(x). 
Proof Suppose the prime element p divides p(x)q(x) as hypothesized. Write 


p(x) = Σ_{i=0}^{n} a_i x^i, and q(x) = Σ_{j=0}^{m} b_j x^j 


where a_i, b_j ∈ D. Suppose by way of contradiction that p does not divide either p(x) 
or q(x). Then by the previous Lemma 9.5.2, p divides each coefficient of p(x)q(x) 
while there is a first coefficient a_r of p(x) not divisible by p, and a first coefficient 
b_s of q(x) not divisible by p. Then the coefficient of x^{r+s} in p(x)q(x) is 


c_{r+s} = Σ_{i+j=r+s} a_i b_j subject to 0 ≤ i ≤ n and 0 ≤ j ≤ m (9.1) 


Now if (i, j) ≠ (r, s), and i + j = r + s, then either i < r or j < s, and in either 
case a_i b_j is divisible by p. Thus all summands a_i b_j in the right side of Eq. (9.1) 
except possibly a_r b_s are divisible by p. But p is a prime element in D that does not 
divide either a_r or b_s. It follows that p does not divide a_r b_s which would mean that 
p does not divide c_{r+s}, against Lemma 9.5.2 and the fact that p divides p(x)q(x). 

Thus p must divide one of p(x) or q(x), completing the proof. 


We are all familiar with the way that the rational number field Q is obtained as a 
system of fractions of integers. In an identical manner, one can form a field of fractions 
F of any integral domain D. Its elements are “fractions”, that is, equivalence classes 
of pairs in D × D*, where the pairs (n, d) and (n′, d′) are equivalent if and only if 
nd′ = n′d. In particular the equivalence class [n, d] containing the pair (n, d) contains 
all the pairs (bn, bd) as b ranges over all non-zero elements b ∈ D*. Addition 
and multiplication of classes are as they are for rational numbers: 


[a, b] + [c, d] = [ad + bc, bd] and [a, b] · [c, d] = [ac, bd]. 


(This is an example of a localization F = D_S, of the sort studied in the next section, 
with S = D* = D — {0}.) Since D is a subdomain of the field F, D[x], the domain 
of polynomials with coefficients from D, can be regarded as a subring of F[x] by 
the device of regarding coefficients d of D as fractions d/1. We wish to compare the 
factorization of elements in D[x] with those in F[x]. 

We say that a polynomial p(x) € D[x] is primitive if and only if, whenever d € D 
divides p(x), then d is a unit. 

From this point onward, we assume that D is a UFD, so all irreducible elements 
are prime and every non-zero element is a unit or a product of prime elements. 

We now approach two lemmas that involve F[x] where F is the field of fractions 
of D. 


Lemma 9.5.4 If p(x) = αq(x), where p(x) and q(x) are primitive polynomials in 
D[x], and α ∈ F, then α is a unit in D. 


Proof Since D is a UFD, greatest common divisors and least common multiples 
exist, and so there is a “lowest terms” representation of α as a fraction t/s where any 
greatest common divisor of s and t is a unit. If s is itself a unit, then t divides each 
coefficient of p(x), and since p(x) is primitive, t must also be a unit. In that case 
α = t/s = ts^{-1} is a unit of D, as claimed. Otherwise, there is a prime divisor p of s. 
Then p does not divide t, and so from sp(x) = tq(x), we see that every coefficient 
of q(x) is divisible by p, against q(x) being primitive. 


Lemma 9.5.5 If f(x) ∈ F[x] is non-zero, then f(x) has a factorization 


f(x) = r p(x) 


where r ∈ F and p(x) is a primitive polynomial in D[x]. This factorization is unique 
up to replacement of each factor by an associate. 


Proof We prove this in two steps. First we establish 


Step 1. There exists a scalar γ such that f(x) = γ p(x), where p(x) is primitive 
in D[x]. 


Each non-zero coefficient of x^i in f(x) has the form a_i/b_i with b_i ∈ D − {0}. 
Multiplying through by a least common multiple m of the b_i (recall that lcm’s exist 
in UFD’s), we obtain a factorization 


f(x) = (1/m) Σ_i a_i (m/b_i) x^i, 


whose second factor lies in D[x]. Dividing that factor by a greatest common divisor c 
of its coefficients (gcd’s also exist in UFD’s) leaves a primitive polynomial p(x) in 
D[x], so that f(x) = (c/m) p(x). 

Step 2. The factorization in Step 1 is unique up to associates. 

Suppose f(x) = γ_1 p_1(x) = γ_2 p_2(x) with γ_i ∈ F, and p_i(x) primitive in D[x]. 
Then p_1(x) and p_2(x) are associates in F[x] so 


p_1(x) = γ p_2(x), for γ ∈ F. 


Then by Lemma 9.5.4, γ is a unit in D. But as γ = γ_2 γ_1^{-1}, the result follows. 
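
For D = Z and F = Q, Step 1 can be carried out mechanically. The sketch below clears denominators with a least common multiple m and then divides out the gcd c of the resulting integer coefficients, so that the second factor is primitive; the function name and the coefficient-list representation (constant term first) are assumptions of this illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def primitive_factorization(coeffs):
    """Write a non-zero f in Q[x] as r * p(x) with r in Q and p primitive in Z[x].

    coeffs lists the coefficients of f, lowest degree first.
    """
    fracs = [Fraction(c) for c in coeffs]
    # m: a least common multiple of the denominators b_i.
    m = reduce(lambda x, y: x * y // gcd(x, y), (c.denominator for c in fracs), 1)
    cleared = [int(c * m) for c in fracs]      # coefficients a_i * (m / b_i) in Z
    c = reduce(gcd, cleared, 0)                # gcd of the cleared coefficients
    return Fraction(c, m), [a // c for a in cleared]

r, p = primitive_factorization([Fraction(3, 2), Fraction(9, 4), 6])
print(r, p)
```

The scalar r is determined only up to a unit of Z, that is, up to sign; this sketch always returns a positive r.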


A non-zero element of an integral domain D’ is said to be reducible simply if it 
is not irreducible—i.e. it has a factorization into two non-units of D’. 


Lemma 9.5.6 (Gauss’ Lemma) Suppose p(x) ∈ D[x] is reducible in the ring F[x]. 
Then p(x) is reducible in D[x]. 


Proof By hypothesis p(x) = f(x)g(x), where f(x) and g(x) are polynomials of 
positive degree in F[x]. We may assume that p(x) is primitive, since otherwise some 
non-unit of D divides p(x), and p(x) is then already reducible in D[x]. 
Then by Step 1 of the proof of Lemma 9.5.5 above, we may 
write 


f(x) = μ f_1(x) 
g(x) = γ g_1(x), 

where μ, γ ∈ F and f_1(x) and g_1(x) are primitive polynomials in D[x]. Now by 
Lemma 9.5.3, f_1(x)g_1(x) is a primitive polynomial in D[x] so 


p(x) = (μγ)(f_1(x) g_1(x)), 

and Lemma 9.5.4 shows that μγ is a unit in D. Then 

p(x) = ((μγ) f_1(x)) · g_1(x) 


is a factorization of p(x) in D[x], with factors of positive degree. The conclusion 
thus follows. 
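
The step that carries the proof, namely that a product of primitive polynomials is again primitive, can be observed directly in Z[x]. In the sketch below the function names content and poly_mul and the sample polynomials are illustrative assumptions; polynomials are coefficient lists with the constant term first.

```python
from functools import reduce
from math import gcd

def content(poly):
    """gcd of the coefficients of a polynomial in Z[x] (coefficient list, lowest degree first)."""
    return reduce(gcd, poly, 0)

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product

f1 = [3, 2]       # 2x + 3, primitive
g1 = [5, 0, 4]    # 4x^2 + 5, primitive
print(content(f1), content(g1), poly_mul(f1, g1), content(poly_mul(f1, g1)))
```

A polynomial in Z[x] is primitive exactly when its content is 1, so the last printed value being 1 is one instance of the consequence of Lemma 9.5.3 used above.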


Theorem 9.5.7 If D is a UFD, then so is D[x]. 


Proof Let p(x) be any element of D[x]. We must show that p(x) has a factorization 
into irreducible elements which is unique up to the replacement of factors by asso- 
ciates. Since D is a unique factorization domain, a consideration of degrees shows 
that it is sufficient to do this for the case that p(x) is primitive of positive degree. 

Let S = D − {0} and form the field of fractions F = D_S, regarding D[x] 
as a subring of F[x] in the usual way. Now by Corollary 9.4.4, F[x] is a unique 
factorization domain. Thus we have a factorization 


p(x) = p_1(x) p_2(x) ⋯ p_n(x) 


where each p_i(x) is irreducible in F[x]. Then by Lemma 9.5.5, there exist scalars 
γ_i, i = 1, ..., n, such that p_i(x) = γ_i q_i(x), where q_i(x) is a primitive polynomial 
in D[x]. Then 

p(x) = (γ_1 γ_2 ⋯ γ_n) q_1(x) ⋯ q_n(x). 


Since the q_i(x) are primitive, so is their product (Lemma 9.5.3). Then by Lemma 9.5.4 
the product of the γ_i is a unit u of D. Thus 


p(x) = u q_1(x) ⋯ q_n(x) 


is a factorization in D[x] into irreducible elements. 
If 


p(x) = v r_1(x) ⋯ r_m(x) 


were another such factorization, the fact that F[x] is a PID and hence a UFD shows 
that m = n and the indexing can be chosen so that r_i(x) is an associate of q_i(x) in 
F[x], i.e. there exist scalars ρ_i in F such that r_i(x) = ρ_i q_i(x), for all i. Then as 
r_i(x) and q_i(x) are irreducible in D[x], they are primitive, and so by Lemma 9.5.4, 
each ρ_i is a unit in D. Thus the two factorizations of p(x) are alike up to replacement 
of the factors by associates in D[x]. The proof is complete. 
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Corollary 9.5.8 If D is a UFD, then so is the ring 
D[x1,..., Xn]. 


Proof Repeated application of Theorem 9.5.7 to 


D[x1,...,xj41] = (Di, ..., x) [xj41]. 


9.6 Localization in Commutative Rings 


9.6.1 Localization by a Multiplicative Set 


Section 9.2 revealed that the divisibility structure of $D$, as displayed by the divisibility poset Div($D$), is in part controlled by the group of units. It is then interesting to know that the group of units can be enlarged by a process described in this section.⁵

We say that a subset $S$ of a ring is multiplicatively closed if and only if $SS \subseteq S$ and $S$ does not contain the zero element of the ring. If $S$ is a non-empty multiplicatively closed subset of the ring $R$, we can define an equivalence relation "$\sim$" on $R \times S$ by the rule that $(a, s) \sim (b, t)$ if and only if, for some element $u$ in $S$, $u(at - bs) = 0$.

Let us first show that the relation "$\sim$" is truly an equivalence relation. Obviously the relation "$\sim$" is symmetric and reflexive. Suppose now

$$(a, s) \sim (b, t) \sim (c, r) \quad \text{for } \{r, s, t\} \subseteq S.$$

Then there exist elements $u$ and $v$ in $S$ such that

$$u(at - bs) = 0 \quad \text{and} \quad v(br - tc) = 0.$$

Multiplying the first equation by $vr$ we get $vr(uat) = vr(ubs)$. But $vr(ubs) = us(vbr) = us(vtc)$, by the second equation. Hence

$$vut(ar) = vut(sc),$$

so $(a, s) \sim (c, r)$, since $vut \in S$. Thus $\sim$ is a transitive relation.

For any element $s \in S$, we now let the symbol $a/s$ (sometimes written $\frac{a}{s}$) denote the $\sim$-equivalence class containing the ordered pair $(a, s)$. We call these equivalence classes fractions.

5 When this happens, the student may wish to verify that the new divisibility poset so obtained is a homomorphic image of the former poset.

Next we show that these equivalence classes enjoy a ring structure. First observe that if there is an element $u$ in $S$ such that $u(as' - sa') = 0$ (i.e. $(a, s) \sim (a', s')$) then $u(abs't - sta'b) = 0$, so $(ab, st) \sim (a'b, s't)$. Thus we can unambiguously define the product of two equivalence classes (or "fractions") by setting $(a/s)\cdot(b/t) = ab/st$. Similarly, if $(a, s) \sim (a', s')$, so that for some $u \in S$, $u(as' - sa') = 0$, we see that

$$u(at + sb)s't - u(a't + s'b)st = 0,$$

so

$$(at + bs)/st = (a't + s'b)/s't.$$

Thus 'addition', defined by setting

$$\frac{a}{s} + \frac{b}{t} = \frac{at + bs}{st},$$

is well defined, since this is also $(a't + s'b)/s't$.

The set of all $\sim$-classes on $R \times S$ is denoted $R_S$. Now that multiplication and addition are defined on $R_S$, we need only verify that


1. $(R_S, +)$ is an abelian group with identity element $0/s$, that is, the $\sim$-class containing $(0, s)$ for any $s \in S$.

2. $(R_S, \cdot)$ is a commutative monoid with multiplicative identity $s/s$, for any $s \in S$.

3. Multiplication is distributive with respect to addition in $R_S$.


All of these are left as Exercises in Sects. 9.13.1-9.13.2. The conclusion, then, is that

$(R_S, +, \cdot)$ is a ring.


This ring $R_S$ is called the localization of $R$ by $S$.
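The role of the extra factor $u$ in the definition of "$\sim$" is easiest to see in a ring with zero divisors. The following minimal sketch enumerates the fractions for the hypothetical choice $R = \mathbb{Z}/6\mathbb{Z}$ and $S = \{1, 2, 4\}$ (an example picked for this illustration, not one from the text); the helper names are likewise assumptions.

```python
from itertools import product

N = 6          # R = Z/6Z, represented by 0..5
S = (1, 2, 4)  # closed under multiplication mod 6 and does not contain 0

def equivalent(p, q):
    """(a, s) ~ (b, t) iff u*(a*t - b*s) = 0 in R for some u in S."""
    (a, s), (b, t) = p, q
    return any((u * (a * t - b * s)) % N == 0 for u in S)

def fraction_classes():
    """Partition R x S into ~-classes (valid because ~ is an equivalence relation)."""
    classes = []
    for pair in product(range(N), S):
        for cls in classes:
            if equivalent(pair, cls[0]):
                cls.append(pair)
                break
        else:
            classes.append([pair])
    return classes

print(len(fraction_classes()))      # 3
print(equivalent((3, 1), (0, 1)))   # True, via u = 2
print((3 * 1 - 0 * 1) % N == 0)     # False: the naive rule at = bs fails here
```

The eighteen pairs collapse to three fractions, consistent with the standard fact that inverting 2 in $\mathbb{Z}/6\mathbb{Z}$ leaves a copy of $\mathbb{Z}/3\mathbb{Z}$. The fraction $3/1$ equals $0/1$ only because of the factor $u = 2$.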


9.6.2 Special Features of Localization in Domains 


For each element $s$ in the multiplicatively closed set $S$, there is a homomorphism of additive groups

$$\psi_s : (R, +) \to (R_S, +)$$

given by $\psi_s(r) = r/s$. This need not be an injective morphism. If $a/s = b/s$, it means that there is an element $u \in S$ such that $us(a - b) = 0$. Now if the right annihilator of $us$ is nontrivial, such an $a - b$ exists with $a \neq b$, and $\psi_s$ is not one-to-one. Conversely, we can say, however,


Lemma 9.6.1 If $s$ is an element of $S$ such that no element of $sS$ is a "zero divisor" (an element with a non-trivial annihilator), then $\psi_s : R \to R_S$ is an embedding of additive groups. The converse is also true. In particular, we see that in an integral domain, where every non-zero element has only a trivial annihilator, the map $\psi_s$ is injective for each $s \in S$.


Lemma 9.6.2 Suppose $S$ is a multiplicatively closed subset of non-zero elements of the commutative ring $R$ and suppose no element of $S$ is a zero divisor in $R$.


(i) Suppose $ac \neq 0$ in $R$ and $b, d$ are elements of $S$. Then $(a/b)\cdot(c/d)$ is non-zero in the localized ring $R_S$.

(ii) If $a$ is not a zero divisor in $R$, then for any $b \in S$, $a/b$ is not a zero divisor in $R_S$.

(iii) If $R$ is an integral domain, then so is $R_S$.

(iv) If $R$ is an integral domain, the mapping $\psi_1 : R \to R_S$ is an injective homomorphism of rings (that is, an embedding of rings).


Proof Part (i). Suppose $ac \neq 0$, but that for some $\{b, d, s\} \subseteq S$,

$$\left(\frac{a}{b}\right)\left(\frac{c}{d}\right) = \frac{0}{s}. \qquad (9.2)$$

Then

$$(ac, bd) \sim (0, s).$$

Then by the definition of the relation "$\sim$", there is an element $u \in S$ such that $u(acs - bd \cdot 0) = 0$, which implies $uacs = 0$. But since $u$ and $s$ are elements of $S$, they are not zero divisors, and so we obtain $ac = 0$, a contradiction. Thus the assumption $ac \neq 0$ forces Eq. (9.2) to be false. This proves Part (i).

Parts (ii) and (iii) follow immediately from Part (i).

Part (iv). Assume $R$ is an integral domain. From the second statement of Lemma 9.6.1, the mapping $\psi_s : R \to R_S$ which takes element $r$ to element $r/s$ is an injective homomorphism of additive groups for each $s \in S$. Now put $s = 1$. We see that $\psi_1(ab) = \psi_1(a)\psi_1(b)$ for all $a, b \in R$. Thus $\psi_1$ is an injective ring homomorphism.


9.6.3 A Local Characterization of UFD’s 


Theorem 9.6.3 (Local Characterization of UFD's) Let $D$ be an integral domain. Let $S$ be the collection of all elements of $D$ which can be written as a product of a unit and a finite number of prime elements. Then $S$ is multiplicatively closed and does not contain zero; so the localization $D_S$ can be formed.

The domain $D$ is a UFD if and only if $D_S$ is a field.


Proof ($\Rightarrow$) If $D$ is a UFD, then, by Lemma 9.4.1, $S$ comprises all non-zero non-units of $D$. One also notes that $D_S = D_{S'}$ where $S'$ is $S$ with the set $U(D)$ of all units of $D$ adjoined. (By Exercise (2) in Sect. 9.13.1 of this chapter, this holds for all domains.) Thus $S' = D - \{0\}$. Then $D_{S'}$ is the field of fractions of $D$.

($\Leftarrow$) Let $D^* := D - \{0\}$, the non-zero elements of $D$. It suffices to prove that $S$ is all non-zero non-units of $D$ (Lemma 9.4.1). In fact, as $S \subseteq D^* - U(D)$, it suffices to show that $D^* - U(D) \subseteq S$. Let $r$ be any non-zero non-unit, i.e. an element of $D^* - U(D)$, and suppose by way of contradiction that $r \notin S$.

Since $r$ is a non-unit, $Dr$ is a proper ideal of $D$. Then

$$(Dr)_S := \{a/s \mid a \in Dr,\ s \in S\}$$

is clearly a non-zero ideal of $D_S$. Since $D_S$ is a field, it must be all of $D_S$. But this means that $1/1$ is an element of $(Dr)_S$, i.e. one can find $(b, s) \in D \times S$, such that

$$br/s = 1/1.$$

This means $br = s$ so

$$Dr \cap S \neq \emptyset.$$

Now from the definition of $S$, we see that there must be multiples of $r$ which can be expressed as a unit times a non-zero number of primes. Let us choose the multiple so that the number of primes that can appear in such a factorization attains a minimum $m$. Thus there exists an element $r'$ of $D$ such that

$$rr' = u p_1 \cdots p_m$$

where $u$ is a unit and the $p_i$ are primes. We may suppose the indexing of the primes to be such that $p_1, \dots, p_d$ do not divide $r'$, while $p_{d+1}, \dots, p_m$ do divide $r'$. Then if $d < m$, $p_{d+1}$ divides $r'$ so $r' = bp_{d+1}$. Then we have

$$rbp_{d+1} = u p_1 \cdots p_d\, p_{d+1} \cdots p_m$$

so

$$rb = u p_1 \cdots p_d\, p_{d+2} \cdots p_m \quad (m - 1 \text{ prime factors})$$

against the minimal choice of $m$. Thus $d = m$, and each $p_i$ does not divide $r'$. Then each $p_i$ divides $r$. Thus $r = a_1 p_1$ so $a_1 r' = u p_2 \cdots p_m$, upon canceling $p_1$. But again, as $p_2$ does not divide $r'$, $p_2$ divides $a_1$. Thus $a_1 = a_2 p_2$ so $a_2 r' = u p_3 \cdots p_m$. As each $p_i$ does not divide $r'$, this argument can be repeated, until finally one obtains $a_m r' = u$ when the primes run out. But then $r'$ is a unit. Then

$$r = ((r')^{-1}u)\, p_1 \cdots p_m \in S.$$

This contradicts the assumption $r \notin S$. As $r$ was arbitrarily chosen in $D^* - U(D)$, we conclude that $D^* - U(D) \subseteq S$, and we are done.


9.6.4 Localization at a Prime Ideal


Now let $P$ be a prime ideal of the commutative ring $R$. That means $P$ has this property: If, for two ideals $A$ and $B$ of $R$, one has $AB \subseteq P$, then at least one of $A$ and $B$ lies in $P$. For commutative rings, this is equivalent to asserting either of the following (see Exercise (2) in Sect. 7.5.2 at the end of Chap. 7):


(1) For elements $a$ and $b$ in $R$, $ab \in P$ implies $a \in P$ or $b \in P$.
(2) The set $R - P$ is a multiplicatively closed set.


It should be clear that a prime ideal contains the annihilator of every element outside it.

Now let $P$ be a prime ideal of the integral domain $D$, and set $S := D - P$, which, as noted, is multiplicatively closed. Then the localization of $D$ by $S$ is called the localization of $D$ at the prime ideal $P$. In the literature the prepositions are important: The localization by $S = D - P$ is the localization of $D$ at the prime ideal $P$.

Now we may form the ideal $M := \{p/s \mid (p, s) \in P \times S\}$ in $D_S$, for which each element of $D_S - M$, being of the form $s'/s$, $(s, s') \in S \times S$, is a unit of $D_S$. This forces $M$ to be a maximal ideal of $D_S$, and in fact, every proper ideal $B$ of $D_S$ must lie in it. Thus we see


$D_S$ has a unique maximal ideal $M$. (9.3)


Any ring having a unique maximal ideal is called a local ring.

This discussion together with Part (iv) of Lemma 9.6.2 yields

Theorem 9.6.4 Suppose $P$ is a prime ideal of the integral domain $D$. Then the localization at $P$ (that is, the ring $D_S$ where $S = D - P$) is a local ring that is also an integral domain.
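As a concrete instance, take $D = \mathbb{Z}$ and $P = (p)$ for a prime number $p$. The localization at $P$ can then be viewed inside $\mathbb{Q}$ as the fractions whose denominator (in lowest terms) is not divisible by $p$. The sketch below, with helper names `in_localization` and `is_unit` that are assumptions of this illustration, tests membership and unit-hood and checks on one sample that a sum of non-units is again a non-unit, as it must be if the non-units form the ideal $M$.

```python
from fractions import Fraction

def in_localization(q: Fraction, p: int) -> bool:
    """True if q lies in the localization of Z at (p); Fraction keeps q in lowest terms."""
    return q.denominator % p != 0

def is_unit(q: Fraction, p: int) -> bool:
    """True if q is a unit there: numerator and denominator are both prime to p."""
    return in_localization(q, p) and q.numerator % p != 0

p = 5                                     # assumed to be a prime number
x, y = Fraction(10, 3), Fraction(5, 7)    # two elements of the maximal ideal M
print(in_localization(Fraction(1, 5), p)) # False: 1/5 is not in the localization
print(is_unit(Fraction(3, 4), p))         # True
print(is_unit(x, p), is_unit(y, p))       # False False
print(is_unit(x + y, p))                  # False: x + y = 85/21 stays in M
```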


Example 47 The zero ideal $\{0\}$ of any integral domain is a prime ideal. Forming the localization at the zero ideal of an integral domain $D$ thus produces an integral domain with the zero ideal as the unique maximal ideal and every non-zero element a unit. This localized domain is called the field of fractions of the domain $D$.⁶ (This standard construction was used in Sect. 9.5 in studying $D[x]$ as a subring of $D_S[x]$.)


9.7 Integral Elements in a Domain 


9.7.1 Introduction 


In Sect. 8.2.5 the notion of an algebraic integer was introduced as a special instance of sets of field elements that are integral over a subdomain. Using elementary properties of finitely-generated modules and Noetherian rings, it was a fairly simple matter to demonstrate that sums and products of algebraic integers were again algebraic integers. At the beginning of the Appendix to this chapter (which the student should now be able to read without further preparation) these discussions are carried a bit further, wherein the algebraic integers contained in any quadratic extension of the rational field are determined. Furthermore, the factorization properties of these quadratic domains are addressed, with the result that these rings enjoy varying degrees of good factorization properties, ranging from being Euclidean (as with the Gaussian integers $\mathbb{Z}[i]$) through not even satisfying unique factorization (as with $\mathbb{Z}[\sqrt{-5}]$).

6 One finds "field of quotients" or even "quotient field" used in place of "field of fractions" here and there. Such usage is discouraged because the literature also abounds with instances in which the term "quotient ring" is used to mean "factor ring".

A little historical perspective is in order. The subject of Arithmetic is probably 
one of the first subjects that a student of Mathematics encounters. Its fundamentals 
can be taught to anyone. But its mysteries are soon apparent to even the youngest 
student. Why is any positive prime number that is one more than a multiple of four, 
the sum of the squares of two integers? Why is every positive integer expressible as 
the sum of four squares of integers? These problems have been solved. But there are 
many more unsolved problems. In fact, there is no other field of Mathematics which presents so many unsolved problems that could be easily stated to the man on the street. An example is the famous Goldbach Conjecture, which asserts that


Every even integer greater than two is the sum of two prime numbers.


One of these questions concerns an assertion known as "Fermat's Last Theorem". According to tradition, Fermat had jotted in the margin of a book that he had a proof of the following theorem:⁷


Suppose $x, y, z$ is a triplet of (not-necessarily distinct) non-zero integers. If $n$ is a positive integer greater than 2, then there exists no such triplet such that

$$x^n + y^n = z^n. \qquad (9.4)$$


That is, in the realm of integers, no $n$th power can be the sum of two other $n$th powers for $n > 2$. Perhaps the problem became more intriguing because there are solutions when $n = 2$. At any rate this problem attracted the attention of many great mathematicians of the 19th century. Here is a case where the solution was far less important than the theory that was developed to solve the problem. Our earliest mathematical ancestors might not have approached the problem with a full-fledged Galois Theory of field extensions at hand, but they knew that they were dealing with some "subring" of "integers" in these fields. (Complex numbers had been explicitly developed by Gauss and the idea of Gaussian integers may be seen as an anticipation of the general idea guiding others to this point.) So this was the foil that produced the concept of Algebraic Integers.

7 The book in which he wrote this marginal note has never been found. All that we actually have is a third party who initiated the anecdote. So there are two schools of thought: (1) Fermat thought that he had a proof but must have made a mistake. Skeptics believe his proof could not only not fit in a margin, but that his proof (in order to be less than a thousand pages) must have been in error. Then there is the other view: (2) He really did prove the theorem in a relatively simple way, and we just haven't discovered how he did it. The authors are personally acquainted with at least one great living research mathematician who does not rule out the second view. That man is not willing to dismiss a mind (like Fermat's) of such proven brilliance. That respect says something.

It came as quite a revelation to early nineteenth century mathematicians that even 
in the algebraic integer domains, unique factorization of elements into prime elements 
could fail. Perhaps our intellectual history (as opposed to the political one) is simply 
the gradual divestment of unanalyzed assumptions. 

The main objects of study for the rest of this chapter will be the rings $\mathcal{O}_E$ consisting of the elements in the field $E$ which are integral with respect to a subdomain $D$, where $E$ is a finite extension of $F := F(D)$, the field of fractions of $D$. (The phrase "finite extension" means that the field $E$ contains $F$ and is finite dimensional as a vector space over its subfield $F$.) In the special case that $D = \mathbb{Z}$, the ring of integers (so that $F = \mathbb{Q}$, the field of the rational numbers), the rings $\mathcal{O}_E$ are the algebraic integer domains of the previous paragraph.

Such algebraic integer domains $\mathcal{O}_E$ include the above-mentioned quadratic domains (such as $\mathbb{Z} + \mathbb{Z}\sqrt{-5}$). Thus, while microscopically (i.e., element-wise), these rings may not enjoy unique factorization, they all satisfy a macroscopic version of unique factorization inasmuch as their ideals will always factor uniquely as a product of prime ideals. This can be thought of as a partial remedy to the fact that the rings $\mathcal{O}_E$ tend not to be UFDs.


9.8 Rings of Integral Elements 


Suppose $K$ is a field, and $D$ is a subring of $K$. Then, of course, $D$ is an integral domain. Recall from Sect. 8.2.5 that an element $\alpha$ of $K$ is integral over $D$ if and only if $\alpha$ is a zero of a polynomial in $D[x]$ whose lead coefficient is 1 (the multiplicative identity of $D$ and $K$). Say that an integral domain $D$ is integrally closed if, whenever $\alpha \in F(D)$ and $\alpha$ is integral over $D$, then $\alpha \in D$. Here $F(D)$ is the field of fractions of the integral domain $D$.

The following is a sufficient, but not a necessary condition, for an integral 
domain to be integrally closed. 


Lemma 9.8.1 If $D$ is a UFD, then $D$ is integrally closed.


Proof Graduate students who have taught basic-level courses such as "college algebra" will recognize the following argument. Indeed, given that $D$ is a UFD, with field of fractions $F$, then elements of $F$ may be expressed in the form $\alpha = a/b$, where $a$ and $b$ have no common prime factors in $D$. If such an element $\alpha$ were integral over $D$, then we would have a monic polynomial $f(x) = \sum_{i=0}^{r} a_i x^i \in D[x]$ with $f(\alpha) = 0$; thus

$$\left(\frac{a}{b}\right)^r + a_{r-1}\left(\frac{a}{b}\right)^{r-1} + \cdots + a_1\left(\frac{a}{b}\right) + a_0 = 0.$$

Upon multiplying both sides of the above equation by $b^r$ one obtains

$$a^r + a_{r-1}\, b\, a^{r-1} + \cdots + a_1 b^{r-1} a + a_0 b^r = 0.$$

Therefore, any prime divisor $p$ of $b$ will divide $\sum_{i=0}^{r-1} a_i b^{r-i} a^i$ and so would divide $a^r$. Since $D$ is a UFD, that would imply that this prime divisor $p$ divides $a$, as well.⁸ This is a contradiction since $a$ and $b$ possess no common prime factors. We are left with the case that $b$ is not divisible by any prime whatsoever. In that case $b$ is a unit, in which case $\alpha = a/b \in D$.


Remark The above lemma provides us with a very large class of integral domains that are not UFDs. Indeed, the Appendix will show that the quadratic domains $\mathcal{O} = A(d)$ consisting of algebraic integers in the quadratic extension $\mathbb{Q}(\sqrt{d}) \supseteq \mathbb{Q}$, where $d$ is a square-free integer, $d \equiv 1 \bmod 4$, have the description

$$A(d) = \left\{ \frac{a + b\sqrt{d}}{2} \;\middle|\; a, b \in \mathbb{Z},\ a, b \text{ are both even or are both odd} \right\}.$$

This implies immediately that the proper subring $\mathbb{Z}[\sqrt{d}] \subsetneq A(d)$ is not integrally closed and therefore cannot enjoy unique factorization. A more direct way of seeing this is that in the domain $D = \mathbb{Z}[\sqrt{d}]$, where $d$ is square-free and congruent to 1 modulo 4, the element 2 is irreducible in $D$ (easy to show directly) but not prime, as $2 \mid (1 + \sqrt{d})(1 - \sqrt{d})$ but 2 doesn't divide either $1 + \sqrt{d}$ or $1 - \sqrt{d}$.
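The two observations in this Remark can be checked by exact arithmetic for a sample value of $d$. The sketch below takes $d = 5$ (a choice made only for this illustration) and represents $x + y\sqrt{d}$ as the pair $(x, y)$ of rationals; all helper names are hypothetical. It confirms that $\omega = (1 + \sqrt{5})/2$ is a zero of the monic integer polynomial $x^2 - x - (d-1)/4$ while lying outside $\mathbb{Z}[\sqrt{5}]$, and that 2 divides the product $(1 + \sqrt{5})(1 - \sqrt{5})$ but not the factor $1 + \sqrt{5}$.

```python
from fractions import Fraction

D_VALUE = 5  # a square-free d with d = 1 (mod 4); chosen only for illustration

def mul(u, v):
    """Multiply two elements x + y*sqrt(d), each stored as a pair (x, y)."""
    (a, b), (c, e) = u, v
    return (a * c + D_VALUE * b * e, a * e + b * c)

def sub(u, v):
    """Subtract coordinatewise."""
    return (u[0] - v[0], u[1] - v[1])

def in_Z_sqrt_d(u):
    """True if both coordinates are integers, i.e. u lies in Z[sqrt(d)]."""
    return all(Fraction(c).denominator == 1 for c in u)

def divisible_by_2(u):
    """True if u/2 still lies in Z[sqrt(d)]."""
    return in_Z_sqrt_d((Fraction(u[0]) / 2, Fraction(u[1]) / 2))

omega = (Fraction(1, 2), Fraction(1, 2))          # (1 + sqrt(5)) / 2
const = (Fraction(D_VALUE - 1, 4), Fraction(0))   # (d - 1)/4, an integer here
residual = sub(sub(mul(omega, omega), omega), const)

print(residual == (0, 0))        # True: omega is integral over Z
print(in_Z_sqrt_d(omega))        # False: yet omega is not in Z[sqrt(5)]
prod = mul((1, 1), (1, -1))      # (1 + sqrt(5)) * (1 - sqrt(5))
print(prod)                      # (-4, 0)
print(divisible_by_2(prod))      # True
print(divisible_by_2((1, 1)))    # False
```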

The following lemmas essentially recapitulate much of the discussion of 
Sect. 8.2.5. 


Lemma 9.8.2 Let $R \subseteq S$ be integral domains.


1. If $R$ is Noetherian and $S$ is a finitely-generated $R$-module, then every element of $S$ is integral over $R$.

2. If $R \subseteq S \subseteq T$ are integral domains with $T$ a finitely-generated $S$-module and $S$ a finitely-generated $R$-module, then $T$ is a finitely-generated $R$-module.


Proof The first statement is immediate from Theorem 8.2.12 and Lemma 8.2.14. From the hypothesis of the second statement, $T = \sum_i t_i S$ and $S = \sum_j s_j R$, where the summing parameters $i$ and $j$ have finite domains. Thus $T = \sum_{i,j} t_i s_j R$ is generated as an $R$-module by the finite set $\{t_i s_j\}$.


Note that $R$ need not be Noetherian in part 2 of the above Lemma.


8 A word of caution is in order here. If $D$ is not a UFD, it's quite possible for an element (even an irreducible element) to divide a power of an element without dividing the element itself. See Exercise (5) in Sect. 9.12, below.

Lemma 9.8.3 Suppose $K$ is a field containing the Noetherian integral domain $D$. Then we have the following:


1. An element $\alpha \in K$ is integral over $D$ if and only if $D[\alpha]$ is a finitely generated $D$-module (Theorem 8.2.14).

2. If $\mathcal{O}$ is the collection of all elements of $K$ which are integral over $D$, then $\mathcal{O}$ is a subring of $K$ (Theorem 8.2.15).


Let $K$ be a field containing the integral domain $D$ and let $\mathcal{O}$ be the ring of integral elements (with respect to $D$) as in the above Lemma.


Theorem 9.8.4 Let $\mathcal{O}$ be the ring of integral elements with respect to the Noetherian subdomain $D$ of field $K$. Then $\mathcal{O}$ contains all elements of $K$ that are integral with respect to $\mathcal{O}$. Since the field of fractions $F(\mathcal{O})$ lies in $K$, we see that $\mathcal{O}$ is integrally closed.


Proof Assume that $\alpha \in K$ is integral over $\mathcal{O}$. Thus there exist coefficients $a_0, a_1, \dots, a_{n-1} \in \mathcal{O}$ with $\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0 = 0$. Therefore, we see that $\alpha$ is integral over the domain $D[a_0, a_1, \dots, a_{n-1}]$. In turn each $a_i$ is integral over $D$, so by repeated application of Lemma 9.8.2 (using both parts), we conclude that $D[a_0, a_1, \dots, a_{n-1}, \alpha]$ is a finitely generated $D$-module. Since $D$ is Noetherian, the submodule $D[\alpha]$ is also finitely generated. This means $\alpha$ is integral over $D$, i.e., $\alpha \in \mathcal{O}$. That $\mathcal{O}$ is integrally closed follows immediately.


The field $K$ is said to be algebraic over a subfield $L$ if and only if every element of $K$ is a zero of a non-zero polynomial in $L[x]$; equivalently, the set of all powers of any single element of $K$ is $L$-linearly dependent. Clearly if $K$ has finite dimension as a vector space over $L$, then $K$ is algebraic over $L$.


Corollary 9.8.5 Suppose $K$ is algebraic over the field of fractions $F = F(D)$ of its subdomain $D$ and let $\mathcal{O}$ denote the ring of elements of $K$ that are integral with respect to $D$. Then for each element $\alpha \in K$, there exists a non-zero element $d_\alpha \in D$, such that $d_\alpha\alpha \in \mathcal{O}$. It follows that $K = F\mathcal{O} = F(\mathcal{O})$.


Proof Suppose $\alpha \in K$. Then the powers of $\alpha$ are linearly dependent over $F$. Thus, for some positive integer $m$, there are fractions $f_0, f_1, \dots, f_{m-1}$ in $F(D) = F$, such that

$$\alpha^m = f_{m-1}\alpha^{m-1} + \cdots + f_1\alpha + f_0. \qquad (9.5)$$

Let $d$ be the product of all the denominators of the $f_i$. Then $d$ and each $df_i$ lies in $D \subseteq \mathcal{O}$. Multiplying both sides of Eq. (9.5) by $d^m$ we obtain

$$(d\alpha)^m = df_{m-1}(d\alpha)^{m-1} + \cdots + d^{m-1}f_1(d\alpha) + d^m f_0.$$

Since all coefficients $d^i f_{m-i}$ lie in $\mathcal{O}$, we see from the statement of Theorem 9.8.4 that $d\alpha$ must lie in $\mathcal{O}$. Since $d \in D \subseteq \mathcal{O}$, and $d \neq 0$ by its definition, we see that $\alpha \in F\mathcal{O} \subseteq F(\mathcal{O})$. Since $\alpha$ was arbitrarily chosen in $K$, the equations $K = F\mathcal{O} = F(\mathcal{O})$ follow.
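The denominator-clearing step in this proof is mechanical, and a small sketch for $D = \mathbb{Z}$ may help. Given the coefficients $f_0, \dots, f_{m-1}$ of Eq. (9.5), the hypothetical helper `integral_multiple` below (a name introduced only for this illustration) returns $d$ together with the integer coefficients $d^{m-j} f_j$ of the monic relation satisfied by $d\alpha$.

```python
from fractions import Fraction
from math import prod, sqrt

def integral_multiple(f):
    """Given f = [f_0, ..., f_{m-1}] with alpha^m = f_{m-1} alpha^{m-1} + ... + f_0,
    return (d, g) where d is the product of the denominators of the f_j and
    g = [g_0, ..., g_{m-1}] are integers with
    (d*alpha)^m = g_{m-1} (d*alpha)^{m-1} + ... + g_0."""
    f = [Fraction(c) for c in f]
    m = len(f)
    d = prod(c.denominator for c in f)
    # the coefficient of (d*alpha)^j is d^(m-j) * f_j, which is an integer
    g = [int(c * d ** (m - j)) for j, c in enumerate(f)]
    return d, g

# alpha^2 = (1/2) alpha + 3/4
d, g = integral_multiple([Fraction(3, 4), Fraction(1, 2)])
print(d, g)  # 8 [48, 4]

# numerical sanity check with the positive root alpha of x^2 - x/2 - 3/4
alpha = (0.5 + sqrt(0.25 + 3.0)) / 2
beta = d * alpha
print(abs(beta ** 2 - (g[1] * beta + g[0])) < 1e-9)  # True
```

So $8\alpha$ is a zero of the monic polynomial $x^2 - 4x - 48 \in \mathbb{Z}[x]$. The product of the denominators is used because the proof uses it; a least common multiple would serve equally well.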
There is an interesting property that $\mathcal{O}$ inherits from $D$.

Theorem 9.8.6 If every prime ideal in $D$ is maximal, then the same is true of $\mathcal{O}$.


Remark Recall that when we speak of an ideal being maximal, we are referring to the poset of all proper ideals of a ring.


Proof of Theorem 9.8.6 Suppose $P$ is a prime ideal of the ring $\mathcal{O}$. By way of contradiction, we assume that $P$ is not a maximal ideal in $\mathcal{O}$. Then $P$ properly lies in a larger ideal $J$ of $\mathcal{O}$, producing the proper containment $P \subset J \subset \mathcal{O}$. Since both $P$ and $J$ are properly contained in $\mathcal{O}$, neither contains the multiplicative identity element 1, which lies in $D$. Thus $P$ and $J$ intersect $D$ at proper ideals of $D$. But $P_0 := P \cap D$ is clearly a prime ideal of $D$ and so by hypothesis is a maximal ideal of $D$. Since $J \cap D$ is a proper ideal of $D$ containing $P_0$, we must conclude that $P_0 = J \cap D$.

Now choose $\beta \in J \setminus P$. Since $\beta$ is integral over $D$, it is the zero of some monic polynomial of $D[x]$. Therefore, there is a non-empty collection of monic polynomials $p(x)$ in $D[x]$, such that $p(\beta)$ lies in $P$. Among these, we choose $p(x)$ of minimal degree. Thus we have

$$p(\beta) = \beta^m + b_{m-1}\beta^{m-1} + \cdots + b_1\beta + b_0 \in P \qquad (9.6)$$

with all $b_j \in D$ and the degree $m$ minimal. Since $\beta$ lies in the ideal $J$, so does

$$\beta(\beta^{m-1} + b_{m-1}\beta^{m-2} + \cdots + b_2\beta + b_1) = p(\beta) - b_0. \qquad (9.7)$$

Since $p(\beta) \in P \subseteq J$, the above Eq. (9.7) shows that $b_0$, being the difference of two elements of $J$, must also lie in $J$. But then $b_0 \in D \cap J = P_0 \subseteq P$. So it now follows that the left side of Eq. (9.7) is a product in $\mathcal{O}$ that lies in the prime ideal $P$. Since $\beta$ does not lie in $P$ by our choice of $\beta$, the second factor

$$\beta^{m-1} + b_{m-1}\beta^{m-2} + \cdots + b_2\beta + b_1$$

is an element of $P$. But that defies the minimality of $m$, and so we have been forced into a contradiction.
It follows that $P$ is a maximal ideal of $\mathcal{O}$.


9.9 Factorization Theory in Dedekind Domains 
and the Fundamental Theorem of Algebraic 
Number Theory 


Let D be an integral domain. We say that D is a Dedekind domain if and only if 


(a) D is Noetherian, 
(b) Every non-zero prime ideal of D is maximal, and 
(c) D is integrally closed.

The reader will notice that if the integral domain $D$ is a field, then $D$ possesses only one proper ideal $\{0\}$, which is both prime and maximal. Such a $D$ is its own field of fractions, and any element $\alpha \in D$ is a zero of the monic polynomial $x - \alpha$ and so is integral over $D$. Thus, in a trivial way, fields are Dedekind domains. However, the interesting features of a Dedekind domain involve its nonzero ideals. For the rest of this section, we begin our examination of the ideal structure of an arbitrary Dedekind domain, $D$.

The reader is reminded of the following notation that is customary in ring theory: If $A$ and $B$ are ideals in a commutative ring $R$, one writes $AB$ for the set $\{\sum_{i=1}^{n} a_i b_i \mid a_i \in A,\ b_i \in B,\ n \in \mathbb{N}\}$, the ideal generated by all products of an element of $A$ with an element of $B$. The ideal $AB$ is called the product of the ideals $A$ and $B$.


Lemma 9.9.1 Assume that $P_1, P_2, \dots, P_r$ are maximal ideals of the integral domain $D$, and that $P$ is a prime ideal satisfying

$$P_1P_2\cdots P_r \subseteq P.$$

Then $P = P_i$ for some $i$.


Proof If $P \neq P_i$, $i = 1, 2, \dots, r$, then, since each $P_i$ is a maximal ideal distinct from $P$ we may find elements $a_i \in P_i \setminus P$, $i = 1, 2, \dots, r$. Thus, $a_1a_2\cdots a_r \in P_1P_2\cdots P_r \subseteq P$. Since $P$ is a prime ideal, we must have $a_i \in P$ for some $i$, which is a contradiction.


Lemma 9.9.2 Any non-zero ideal of $D$ contains a finite product of non-zero prime ideals.


Proof We let $\mathcal{A}$ be the family of all non-zero ideals of $D$ for which the desired conclusion is false. Assume $\mathcal{A}$ is non-empty. As $D$ is Noetherian, $\mathcal{A}$ must contain a maximal member $I_0$. Clearly, $I_0$ cannot be a prime ideal, since it is a member of $\mathcal{A}$. Accordingly, there must exist a pair of elements $\alpha, \beta \in D \setminus I_0$ with $\alpha\beta \in I_0$. Next, form the ideals $I_0' = I_0 + \alpha D$ and $I_0'' = I_0 + \beta D$. By maximality of $I_0$, neither of the ideals $I_0'$ and $I_0''$ can lie in $\mathcal{A}$. Therefore, $I_0'$ and $I_0''$ respectively contain products $P_1\cdots P_m$ and $P_{m+1}\cdots P_n$ of non-zero prime ideals. Since $I_0'I_0'' \subseteq I_0$, and $I_0'I_0''$ contains $P_1\cdots P_n$, we have arrived at a contradiction. Thus $\mathcal{A} = \emptyset$ and the conclusion follows.


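At this point it is convenient to introduce some notation. Let $D$ be a Dedekind domain with field of fractions $E$. If $I \subseteq D$ is an ideal, set

$$I^{-1} = \{\alpha \in E \mid \alpha \cdot I \subseteq D\}.$$

Note that $D^{-1} = D$, for if $\alpha \cdot D \subseteq D$, then $\alpha = \alpha \cdot 1 \in D$. Next note that $I \subseteq J$ implies that $I^{-1} \supseteq J^{-1}$.

Lemma 9.9.3 If $I$ is a non-zero proper ideal of $D$, then $I^{-1}$ properly contains $D$.


Proof Clearly $D \subseteq I^{-1}$. Let $0 \neq \alpha \in I$; by Lemma 9.9.2, we have prime ideals $P_1, P_2, \dots, P_r$ with $P_1P_2\cdots P_r \subseteq \alpha D \subseteq I$. We may assume, furthermore, that the index $r$ above is minimal. Since $D$ is Noetherian, there is a maximal ideal $M$ containing $I$; thus we have $P_1P_2\cdots P_r \subseteq M$. Applying Lemma 9.9.1 we conclude that $P_i = M$ for some index $i$. Re-index if necessary so that $P_1 = M$. This says that


(i) $MP_2P_3\cdots P_r \subseteq \alpha D \subseteq M$, and
(ii) $P_2P_3\cdots P_r \not\subseteq \alpha D$,


by the minimality of $r$. Let $\beta \in P_2P_3\cdots P_r \setminus \alpha D$ and set $\lambda = \beta/\alpha$. Then $\lambda \in E \setminus D$; yet, by (i) and (ii),

$$\lambda I = \beta\alpha^{-1}I \subseteq \beta\alpha^{-1}M \subseteq \alpha^{-1}MP_2\cdots P_r \subseteq \alpha^{-1}(\alpha D) = D,$$

which puts $\lambda \in I^{-1} \setminus D$.

Lemma 9.9.4 If $I \subseteq D$ is a non-zero ideal then $I^{-1}$ is a finitely generated $D$-module.


Proof If $0 \neq \alpha \in I$, then $I^{-1} \subseteq (\alpha D)^{-1} = \alpha^{-1}D$, which is a cyclic, hence finitely-generated, $D$-module. Since $D$ is Noetherian, $\alpha^{-1}D$ is a Noetherian $D$-module (Theorem 8.2.12), and so its submodule $I^{-1}$ is finitely generated (Theorem 8.2.11).


Theorem 9.9.5 If $I \subseteq D$ is a non-zero ideal, then $I^{-1}I = D$.


Proof Set $B = I^{-1}I \subseteq D$, so $B$ is an ideal of $D$. Thus, $I^{-1}IB^{-1} = BB^{-1} \subseteq D$, which says that $I^{-1}B^{-1} \subseteq I^{-1}$. But then for any $\beta \in B^{-1}$, $I^{-1}\beta \subseteq I^{-1}$, forcing $I^{-1}[\beta] \subseteq I^{-1}$, where $I^{-1}[\beta]$ denotes $\sum_{n \geq 0} I^{-1}\beta^n$. Since, by Lemma 9.9.4, $I^{-1}$ is a Noetherian $D$-module, so is its $D$-submodule $I^{-1}[\beta]$. From $D[\beta] \subseteq I^{-1}[\beta] \subseteq I^{-1}$ we infer that $D[\beta]$ is a finitely-generated $D$-module and so $\beta \in E$ is integral over $D$. As $D$ is integrally closed, it follows that $\beta \in D$. Therefore, $B^{-1} \subseteq D \subseteq B^{-1}$ and so $D = B^{-1}$. An application of Lemma 9.9.3 completes the proof: were $B$ a proper ideal, $B^{-1}$ would properly contain $D$, so $B = D$.


Corollary 9.9.6 If $I, J \subseteq D$ are non-zero ideals, then $(IJ)^{-1} = I^{-1}J^{-1}$.


Proof $D = D^2 = I^{-1}I \cdot J^{-1}J = I^{-1}J^{-1}IJ$. Therefore, $(IJ)^{-1} = D(IJ)^{-1} = I^{-1}J^{-1}(IJ)(IJ)^{-1} = I^{-1}J^{-1}D = I^{-1}J^{-1}$.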


The following theorem gives us basic factorization theory in a Dedekind domain.


Theorem 9.9.7 Let $D$ be a Dedekind domain and let $I \subseteq D$ be a non-zero proper ideal. Then there exist prime ideals $P_1, P_2, \dots, P_r \subseteq D$ such that

$$I = P_1P_2\cdots P_r.$$

The above factorization is unique in that if also

$$I = Q_1Q_2\cdots Q_s,$$

where the $Q_i$'s are prime ideals, then $r = s$ and $Q_i = P_{\pi(i)}$, for some permutation $\pi$ of $1, 2, \dots, r$.


Proof By Lemma 9.9.2 we know that the ideal $I \subseteq D$ contains a product of prime ideals: $P_1P_2\cdots P_r \subseteq I$. We shall argue by induction that if $P_1P_2\cdots P_r \subseteq I$, and if $r$ is minimal in this respect, then, in fact, $P_1P_2\cdots P_r = I$. Since prime ideals are maximal, the result is certainly true when $r = 1$. Next, as $D$ is Noetherian, we may select a maximal ideal $M$ containing $I$; thus we have

$$P_1P_2\cdots P_r \subseteq I \subseteq M.$$

Applying Lemma 9.9.1 we conclude that (say) $M = P_1$. But then

$$M^{-1}MP_2\cdots P_r \subseteq M^{-1}I \subseteq M^{-1}M = D.$$

That is to say, $M^{-1}I$ is an ideal of $D$ and that

$$P_2P_3\cdots P_r = M^{-1}MP_2\cdots P_r \subseteq M^{-1}I.$$

We apply induction to infer that $P_2P_3\cdots P_r = M^{-1}I$. If one multiplies both sides by $M$, then, noting again that $MM^{-1} = D$, one obtains $P_1P_2\cdots P_r = I$. This proves the existence of a factorization of $I$ into a product of prime ideals.

Next we prove uniqueness. Thus, assume that there exist prime ideals $P_1, P_2, \dots, P_r, Q_1, Q_2, \dots, Q_s$ with

$$P_1P_2\cdots P_r = Q_1Q_2\cdots Q_s. \qquad (9.8)$$

We argue by induction on the minimum of $r$ and $s$. If, say, $r = 1$, then we set $P = P_1$ and we have a factorization of the form $P = Q_1Q_2\cdots Q_s$. By Lemma 9.9.1 we may assume that $P = Q_1$ and so $P = PQ_2\cdots Q_s$. Multiply both sides by $P^{-1}$ and infer that $D = Q_2\cdots Q_s$. If $s \geq 2$ this is an easy contradiction as then $D = Q_2\cdots Q_s \subseteq Q_2$.

Thus we may assume that both $s$ and $r$ are at least 2. Since $D$ is Noetherian, we may find a maximal ideal $M$ containing the common ideal in (9.8) above, so

$$P_1P_2\cdots P_r = Q_1Q_2\cdots Q_s \subseteq M.$$

An application of Lemma 9.9.1 allows us to infer (again possibly after re-indexing) that $M = P_1 = Q_1$. Upon multiplying both sides by $M^{-1}$ one obtains $P_2\cdots P_r = Q_2\cdots Q_s$. Induction takes care of the rest.
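As an illustration (a standard example, not one worked in the text above), consider $D = \mathbb{Z}[\sqrt{-5}]$, the ring of algebraic integers of $\mathbb{Q}(\sqrt{-5})$, where the element 6 has two essentially different factorizations into irreducibles. Passing to ideals repairs this. The names $P$, $Q$, $Q'$ below are chosen only for this sketch.

```latex
\begin{align*}
P &= (2,\ 1+\sqrt{-5}), \qquad Q = (3,\ 1+\sqrt{-5}), \qquad Q' = (3,\ 1-\sqrt{-5}),\\
(2) &= P^2, \qquad (3) = QQ', \qquad (1+\sqrt{-5}) = PQ, \qquad (1-\sqrt{-5}) = PQ',\\
(6) &= (2)(3) = P^2\,Q\,Q' = (PQ)(PQ') = (1+\sqrt{-5})(1-\sqrt{-5}).
\end{align*}
```

For instance, $P^2$ is generated by $4$, $2(1+\sqrt{-5})$ and $(1+\sqrt{-5})^2 = -4 + 2\sqrt{-5}$. Each generator is a multiple of 2, and $2 = 2(1+\sqrt{-5}) - (-4+2\sqrt{-5}) - 4$ lies in $P^2$, so $P^2 = (2)$. The other identities are checked the same way. Both element factorizations of 6 thus refine to the single ideal factorization $P^2QQ'$, in agreement with Theorem 9.9.7.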


We close by mentioning the following result; a proof is outlined in Exercise (8) in Sect. 9.9. It can be viewed as saying that Dedekind domains are "almost" principal ideal domains.

Theorem 9.9.8 Let $E \supseteq \mathbb{Q}$ be a finite field extension and let $D = \mathcal{O}_E$. Then any ideal $I \subseteq D$ can be expressed as $I = D\alpha + D\beta$ for suitable elements $\alpha, \beta \in I$.

9.10 The Ideal Class Group of a Dedekind Domain 


We continue to assume that D is a Dedekind domain, with fraction field E. A D- 
submodule B C E is called a fractional ideal if it is a finitely generated D-module. 


Lemma 9.10.1 Let B be a fractional ideal. Then there exist prime ideals 
Pi, Po,..., Pr, Q1, Q2,-.., Qs such that B = P\P)---P,Q;'Q>'+--Qz. (It 
is possible that either r = 0 or s = 0.) 


Proof Since B is finitely generated, there exist elements $\alpha_1, \alpha_2, \ldots, \alpha_k \in E$ with
$B = D[\alpha_1, \ldots, \alpha_k]$. Since E is the fraction field of D, we may choose a nonzero element
$\beta \in D$ with $\beta\alpha_i \in D$, $i = 1, 2, \ldots, k$. It follows immediately that $\beta B \subseteq D$,
i.e., $\beta B$ is an ideal of D. Apply Theorem 9.9.7 to obtain the factorization
$\beta B = P_1 P_2 \cdots P_r$ into prime ideals of D. Next, factor $\beta D$ as $\beta D = Q_1 Q_2 \cdots Q_s$
into prime ideals, so that $(\beta D)^{-1} = Q_1^{-1} Q_2^{-1} \cdots Q_s^{-1}$. Thus, $B = \beta^{-1} P_1 P_2 \cdots P_r =
D[\beta^{-1}] P_1 P_2 \cdots P_r = (\beta D)^{-1} P_1 P_2 \cdots P_r = P_1 P_2 \cdots P_r Q_1^{-1} Q_2^{-1} \cdots Q_s^{-1}$.
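
The factorization in Lemma 9.10.1 can be made concrete in the simplest Dedekind domain, D = Z with E = Q, where every fractional ideal has the form qZ for a rational q. The following is a minimal sketch under that assumption; the function name `prime_exponents` is illustrative and not part of the text. It returns the exponent of each prime ideal pZ, with negative exponents playing the role of the factors $Q_j^{-1}$.

```python
from fractions import Fraction

def prime_exponents(q):
    """Return {p: e} with |q| equal to the product of p**e (e may be negative).

    Interpreted for D = Z: the fractional ideal qZ is the product of the
    prime ideals (pZ)**e over the keys of the returned dictionary.
    """
    q = Fraction(q)
    if q == 0:
        raise ValueError("the zero module is excluded in this sketch")
    exps = {}
    # Numerator primes get positive exponents, denominator primes negative ones.
    for n, sign in ((abs(q.numerator), 1), (q.denominator, -1)):
        p = 2
        while n > 1:
            while n % p == 0:
                exps[p] = exps.get(p, 0) + sign
                n //= p
            p += 1
    return exps
```

For instance, `prime_exponents(Fraction(12, 35))` describes $(12/35)\mathbb{Z} = (2\mathbb{Z})^2 (3\mathbb{Z}) (5\mathbb{Z})^{-1} (7\mathbb{Z})^{-1}$.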


Corollary 9.10.2 The set of fractional ideals in E forms an abelian group under 
multiplication. 


Proof It suffices to prove that for any collection $P_1, \ldots, P_r, Q_1, \ldots, Q_s$ of prime
ideals of D, the D-module $P_1 \cdots P_r Q_1^{-1} \cdots Q_s^{-1}$ is finitely generated over D. Let
$0 \ne \alpha \in Q_1 Q_2 \cdots Q_s$, so that $\alpha D \subseteq Q_1 Q_2 \cdots Q_s$. In turn, it follows that $Q_1^{-1} \cdots Q_s^{-1} \subseteq
(\alpha D)^{-1} = \alpha^{-1} D = D[\alpha^{-1}]$. But then


$P_1 \cdots P_r Q_1^{-1} \cdots Q_s^{-1} \subseteq Q_1^{-1} \cdots Q_s^{-1} \subseteq D[\alpha^{-1}].$


That is to say, $P_1 \cdots P_r Q_1^{-1} \cdots Q_s^{-1}$ is contained in a finitely generated module over
the Noetherian domain D and hence must be finitely generated.


A fractional ideal $B \subseteq E$ is called a principal fractional ideal if it is of the form
$\alpha D$, for some $\alpha \in E$. Note that in this case, $B^{-1} = \alpha^{-1} D$. It is easy to show that if
D is a principal ideal domain, then every fractional ideal is principal (Exercise (1) 
in Sect. 9.10). 

If F is the set of fractional ideals in E we have seen that F is an abelian group 
under multiplication, with identity D. If we denote by P the set of principal fractional 
ideals, then it is easy to see that P is a subgroup of F; the quotient group C = F/P 
is called the ideal class group of D; it is trivial precisely when D is a principal 
ideal domain. If $D = \mathcal{O}_E$ for a finite extension $E \supseteq \mathbb{Q}$, then it is known that C is a
finite group. The order h = |C| is called the class number of D (or of E) and is a 
fundamental (though somewhat subtle) invariant of E. 


9.11 A Characterization of Dedekind Domains


In this section we’ll prove the converse of Theorem 9.9.7, thereby giving a charac- 
terization of Dedekind domains. 

To begin with, let D be an arbitrary integral domain, with fraction field E. In 
analogy with the preceding section, if $I \subseteq D$ is an ideal, we set


$I^{-1} = \{\alpha \in E \mid \alpha I \subseteq D\},$


and say that I is invertible if $I^{-1} I = D$.


Lemma 9.11.1 Assume that an ideal I of D admits factorizations into invertible 
prime ideals: 
$P_1 P_2 \cdots P_r = I = Q_1 Q_2 \cdots Q_s.$


Then r = s, and (possibly after re-indexing) $P_i = Q_i$, $i = 1, 2, \ldots, r$.


Proof We shall apply induction on the total number of ideals r + s. Among the
finitely many ideals in $\{P_i\} \cup \{Q_j\}$ choose one that is minimal with respect to
inclusion. By reindexing the ideals and transposing the symbols “P” and
“Q”, if necessary, we may assume this minimal prime ideal is $P_1$. Since $P_1$ is prime
and contains $\prod Q_j$, we must have $Q_j \subseteq P_1$ for some index j. After a further
reindexing, we may assume j = 1. Now, since $P_1$ was chosen minimal in the finite
poset $(\{P_i\} \cup \{Q_j\}, \subseteq)$, one has $P_1 = Q_1$. Then by invertibility of the ideals, we see
that


$P_2 \cdots P_r = P_1^{-1} I = Q_1^{-1} I = Q_2 \cdots Q_s.$ (9.9)


Applying induction to Eq. (9.9) forces r = s and (with further reindexing) $P_i = Q_i$,
$i = 2, \ldots, s$. This completes the proof.


Lemma 9.11.2 Let D be an integral domain. 


(i) Any non-zero principal ideal is invertible. 
(ii) If $0 \ne x \in D$, and if the principal ideal xD factors into prime ideals as
$xD = P_1 P_2 \cdots P_r$, then each $P_i$ is invertible.

Proof Clearly if $0 \ne x \in D$, then $(xD)^{-1} = x^{-1} D$ and $(xD)(x^{-1} D) = D$, so xD is
invertible, proving (i). For (ii), simply note that for any $i = 1, 2, \ldots, r$,


$D = P_1 P_2 \cdots P_r \cdot (x^{-1} D) = P_i (P_1 \cdots P_{i-1} P_{i+1} \cdots P_r)(x^{-1} D),$


forcing $P_i$ to be invertible.


Now assume that D is an integral domain satisfying the following condition: 
(*) If $I \subseteq D$ is an ideal of D, then there exist prime ideals $P_1, P_2, \ldots, P_r \subseteq D$
such that

$I = P_1 P_2 \cdots P_r.$

Note that no assumption is made regarding the uniqueness of the above factorization.
We shall show not only that uniqueness automatically follows (see Corollary 9.11.6,
below), but that D is actually a Dedekind domain, giving us the desired
characterization.


Theorem 9.11.3 Any invertible prime ideal of D is maximal. 


Proof Let P be an invertible prime ideal and let $a \in D \setminus P$. Define the ideals $I =
P + aD$, $J = P + a^2 D$, and factor into prime ideals:


$I = P_1 P_2 \cdots P_r, \quad J = Q_1 Q_2 \cdots Q_s.$


Note that each $P_i, Q_j \supseteq P$. We now pass to the quotient ring $\bar{D} = D/P$ and set
$\bar{P}_i = P_i/P$, $i = 1, 2, \ldots, r$, and $\bar{Q}_j = Q_j/P$, $j = 1, 2, \ldots, s$. Clearly the ideals
$\bar{P}_i, \bar{Q}_j$, $i = 1, 2, \ldots, r$, $j = 1, 2, \ldots, s$, are prime ideals of $\bar{D}$. Note that where
$\bar{a} = a + P$, we have $\bar{I} = \bar{a}\bar{D}$, $\bar{J} = \bar{a}^2 \bar{D}$, principal ideals of $\bar{D}$.
Note that


$\bar{a}\bar{D} = \bar{I} = \bar{P}_1 \cdots \bar{P}_r, \quad \bar{a}^2 \bar{D} = \bar{J} = \bar{Q}_1 \cdots \bar{Q}_s,$

which, by Lemma 9.11.2 part (ii), imply that the prime ideals $\bar{P}_1, \ldots, \bar{P}_r$ and
$\bar{Q}_1, \ldots, \bar{Q}_s$ are invertible ideals of $\bar{D}$. However, as $\bar{J} = \bar{I}^2$, we have


$\bar{Q}_1 \cdots \bar{Q}_s = \bar{P}_1^2 \cdots \bar{P}_r^2,$


and by Lemma 9.11.1 we conclude that s = 2r and (possibly after reindexing) $\bar{P}_j =
\bar{Q}_{2j-1} = \bar{Q}_{2j}$, $j = 1, 2, \ldots, r$. This implies that $P_j = Q_{2j-1} = Q_{2j}$, $j =
1, 2, \ldots, r$, and so $J = I^2$. Therefore, $P \subseteq J = I^2 = (P + aD)^2 \subseteq P^2 + aD$.
If $x \in P$ we can write $x = y + az$, where $y \in P^2$, $z \in D$. Thus, $az = x - y \in
P$. As $a \notin P$, and P is prime, we infer that $z \in P$. Therefore, in fact, we have
$P \subseteq P^2 + aP \subseteq P$, and so it follows that $P = P^2 + aP$. As P is invertible by
hypothesis, we may multiply through by $P^{-1}$ and get $D = P + aD = I$, and so P
is maximal.


Theorem 9.11.4 Any prime ideal is invertible, hence maximal. 


Proof Let P be a prime ideal of D and let x be a nonzero element of P. We may factor the principal ideal xD
as $xD = P_1 P_2 \cdots P_r$. By Lemma 9.11.2 (ii) the prime ideals $P_i$, $i = 1, 2, \ldots, r$, are
invertible, and hence, by Theorem 9.11.3, they are maximal. Now apply Lemma 9.9.1
to infer that $P = P_i$ for some i, and hence P is invertible. Theorem 9.11.3 now forces
P to be maximal.


The following two corollaries are now immediate: 
Corollary 9.11.5 Any ideal of D is invertible. 


Corollary 9.11.6 Any ideal of D factors uniquely into prime ideals. 


Theorem 9.11.7 D is Noetherian.


Proof Let $I \subseteq D$ be an ideal. Since I is invertible by Corollary 9.11.5, there exist
elements $a_1, a_2, \ldots, a_r \in I$, $b_1, b_2, \ldots, b_r \in I^{-1}$ such that $\sum a_i b_i = 1$. If $x \in I$,
then $b_i x \in D$, $i = 1, 2, \ldots, r$, and $x = \sum_i (x b_i) a_i$, i.e., $I = D[a_1, a_2, \ldots, a_r]$,
proving that I is finitely generated, and hence D is Noetherian.


Our task of showing that D is a Dedekind domain will be complete as soon as we 
can show that D is integrally closed. To do this it is convenient to introduce certain 
“overrings” of D, described below. 

Let D be an arbitrary integral domain and let E = F(D), the field of fractions of
D. If $P \subseteq D$ is a prime ideal of D, we set


$D_P = \{\alpha/\beta \in E \mid \alpha, \beta \in D, \ \beta \notin P\}.$


It should be clear (using the fact that P is a prime ideal) that $D_P$ is a subring of
E containing D. (The reader will recall from Sect. 9.6.4 that $D_P$ is the localization
of D at the prime ideal P.) It should also be clear that the same field of fractions
emerges: $F(D_P) = E$.


Lemma 9.11.8 Let I be an ideal of D, and let P be a prime ideal of D. 


(i) If $I \not\subseteq P$ then $D_P I = D_P$.
(ii) $D_P P^{-1}$ properly contains $D_P$.


Proof Note that $D_P I$ is an ideal of $D_P$. Since $I \not\subseteq P$, any element $a \in I \setminus P$ is a unit
in $D_P$. It follows that $D_P I = D_P$, proving (i). If $D_P P^{-1} = D_P$, then multiplying
through by P, and using the fact that $P^{-1} P = D$, we obtain $D_P = D_P P$. Therefore,
there is an equation of the form


$1 = \sum_{i=1}^{k} \frac{r_i}{s_i} x_i,$ where each $r_i \in D$, $s_i \in D \setminus P$, $x_i \in P$.


Multiply the above through by $s_1 s_2 \cdots s_k$, and set $s_i' = s_1 \cdots s_{i-1} s_{i+1} \cdots s_k$, $i =
1, 2, \ldots, k$. Then

$s_1 \cdots s_k = \sum_{i=1}^{k} r_i s_i' x_i \in P,$


which is an obvious contradiction as each $s_i \notin P$ and P is prime.


Lemma 9.11.9 If $\alpha \in E$ then either $\alpha \in D_P$ or $\alpha^{-1} \in D_P$.


Proof Write $\alpha = ab^{-1}$, $a, b \in D$, and factor the principal ideals aD and bD as
$aD = P^e I$, $bD = P^f J$, where $I, J \not\subseteq P$. Thus, $D_P a = D_P P^e$, $D_P b = D_P P^f$.
Assuming that $e \ge f$, we have $ab^{-1} \in D_P P^e P^{-f} = D_P P^{e-f} \subseteq D_P$. Similarly, if
$e \le f$ one obtains that $ba^{-1} \in D_P$.
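
Lemma 9.11.9 can be checked numerically in the special case D = Z and P = pZ, where $D_P$ consists of the rationals whose reduced denominator is prime to p. The sketch below makes that assumption; the names `valuation` and `in_localization` are illustrative only. The exponent e − f of the proof is exactly the value returned by `valuation`.

```python
from fractions import Fraction

def valuation(q, p):
    """Exponent of the prime p in the nonzero rational q (may be negative)."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("valuation of 0 is not defined in this sketch")
    e, num, den = 0, q.numerator, q.denominator
    while num % p == 0:      # p-power in the numerator: the exponent e of the proof
        num //= p
        e += 1
    while den % p == 0:      # p-power in the denominator: the exponent f of the proof
        den //= p
        e -= 1
    return e

def in_localization(q, p):
    """True when q lies in the localization of Z at pZ."""
    return q == 0 or valuation(q, p) >= 0
```

With p = 5, the rational 3/10 is not in the localization but its inverse 10/3 is, while 7/9 and 9/7 both are.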


Lemma 9.11.10 $D_P$ is integrally closed.


Proof Let $\alpha \in E \setminus D_P$ be integral over $D_P$. Then there is an equation of the form

$\alpha^m + a_{m-1} \alpha^{m-1} + \cdots + a_1 \alpha + a_0 = 0,$


where $a_0, a_1, \ldots, a_{m-1} \in D_P$. Since $\alpha \notin D_P$, we have, by Lemma 9.11.9, that
$\alpha^{-1} \in D_P$; therefore,


$\alpha = -\alpha^{1-m} (a_{m-1} \alpha^{m-1} + \cdots + a_0) = -\left( a_{m-1} + a_{m-2} \frac{1}{\alpha} + \cdots + a_0 \frac{1}{\alpha^{m-1}} \right) \in D_P,$


a contradiction.


Theorem 9.11.11 $D = \bigcap D_P$, the intersection taken over all prime ideals $P \subseteq D$.


Proof Let $ab^{-1} \in \bigcap D_P$, where $a, b \in D$. Factor the principal ideals aD and bD as
$aD = P_1^{e_1} P_2^{e_2} \cdots P_r^{e_r}$, $bD = P_1^{f_1} P_2^{f_2} \cdots P_r^{f_r}$; here all exponents are $\ge 0$. It suffices
to show that $e_i \ge f_i$, $i = 1, 2, \ldots, r$. Fix an index i and set $P = P_i$, $e = e_i$, $f = f_i$.
Then $aD_P = D_P P^e$, $bD_P = D_P P^f$, which gives $ab^{-1} D_P = D_P P^{e-f}$.
Since $ab^{-1} \in D_P$, and since $D_P P^{-1}$ properly contains $D_P$ by Lemma 9.11.8, we
have $ab^{-1} D_P \subseteq D_P \subsetneq D_P P^{-c}$ for all integers c > 0. Thus, it follows that $e - f \ge 0$.


As an immediate result, we get 
Corollary 9.11.12 D is integrally closed. 


Proof Indeed, if $\alpha \in E$ is integral over D, then it is, a fortiori, integral over $D_P$ for every prime ideal P.
Since $D_P$ is integrally closed by Lemma 9.11.10 we have that $\alpha \in D_P$. Now apply
Theorem 9.11.11.


Combining all of the above we get the desired characterization of Dedekind 
domains: 


Corollary 9.11.13 D is a Dedekind domain if and only if every ideal of D can be 
factored into prime ideals. 


We conclude this section with a final remark. 


Corollary 9.11.14 Any Dedekind domain (in particular, any algebraic integer 
domain) is a PID if and only if it is a UFD. 


Proof By Theorem 9.4.3 any principal ideal domain is a unique factorization domain.
So it only remains to show that any Dedekind domain that is a UFD is also a PID.
Assume that the Dedekind domain D is a UFD. We begin by showing that all
prime ideals of D are principal. Thus, let $0 \ne P \subseteq D$ be a prime ideal and let
$0 \ne x \in P$. As D is a UFD, we may factor x into primes: $x = p_1 p_2 \cdots p_k \in P$;
this easily implies that $p_i \in P$ for some index i. Set $p = p_i$. The principal ideal pD
generated by p is a prime ideal and hence is maximal (as D is Dedekind), forcing
P = pD.

Next let $I \subseteq D$ be an arbitrary nonzero ideal of D. Since D is a UFD, each nonzero non-unit
of D can be factored into a unique number of primes in D; we shall denote this number
by $\ell(x)$ (the “length” of x). Now select a nonzero element $x \in I$ such that $\ell(x)$ is a minimum.
If also $y \in I$, and x does not divide y, we take z to be the greatest common divisor of
x and y; clearly $\ell(z) < \ell(x)$. We set x = za, y = zb, where a, b are relatively prime.
If the ideal D(a, b) of D generated by both a and b is not all of D, then since D is
Noetherian, the non-empty collection $\{$proper ideals $J \mid J \supseteq D(a, b)\}$ must contain
a maximal ideal M. Since maximal ideals are prime, the first part of the proof guarantees that
there is a prime p with $M = pD \supseteq D(a, b)$. But this says that p divides both a
and b, which is impossible. Therefore, the ideal D(a, b) must be all of D and so
$1 \in D(a, b)$. Therefore, there exist elements $x_0, y_0 \in D$ with $1 = x_0 a + y_0 b$, which
leads to $z = x_0 a z + y_0 b z = x_0 x + y_0 y \in I$. Since $\ell(z) < \ell(x)$, this contradiction
shows that x divides every element of I, so I = xD is principal.
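
Corollary 9.11.14 can be illustrated with the ring $\mathbb{Z}[\sqrt{-5}]$, which the exercises of Sect. 9.9 treat as a Dedekind domain. The sketch below (helper names are illustrative) uses the norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$, which is multiplicative. Since no element has norm 2 or 3, the elements 2, 3 and $1 \pm \sqrt{-5}$ (of norms 4, 9 and 6) are irreducible, yet $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$. The ring is therefore not a UFD, and by the corollary not a PID.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5), namely a^2 + 5 b^2."""
    return a * a + 5 * b * b

def elements_of_norm(n):
    """All pairs (a, b) with a^2 + 5 b^2 == n, found by a bounded search."""
    bound = int(n ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]

def mul(u, v):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5); elements are stored as pairs."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)
```

The only elements of norm 1 are ±1, so 2 is not an associate of $1 \pm \sqrt{-5}$ and the two factorizations of 6 are genuinely different.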


9.12 When Are Rings of Integers Dedekind? 


The tour de force of the theory of Dedekind domains given in the preceding section
would lead one to suspect that this theory should apply as well to the rings of integral
elements of Sect. 9.8, that is, rings $\mathcal{O}_K$ where K is a field and integrality is with
respect to an integrally closed sub-domain D in which every prime ideal is maximal.⁹

In order to make such a dream a reality, the definition of Dedekind domain would
also demand that $\mathcal{O}_K$ be Noetherian, and that seems to ask that the dimension of K
over the subfield F(D) be finite. Even given that, is it really Noetherian? So, let us
say that a ring O is a classical ring of integral elements if and only if $O = \mathcal{O}_K$ is the
ring of elements of K that are integral with respect to a subdomain D, having the
following properties:


(CRI1) The domain D is integrally closed in its field of fractions F = F(D). (F is 
regarded as a subfield of K.) 

(CRI2) Every prime ideal of the domain D is a maximal ideal of D. 

(CRI3) K has finite dimension as an F-vector space. 


Lemma 9.12.1 If $\mathcal{O}_K$ is a classical ring of integral elements, then


1. It is integrally closed.
2. Every prime ideal of $\mathcal{O}_K$ is maximal.


Proof Conclusion 1 is immediate from Theorem 9.8.4. Theorem 9.8.6 implies
conclusion 2.


⁹ Somehow this historical notation $\mathcal{O}_K$ seems to glorify K at the expense of the subdomain D, which
is hardly mentioned.

Remark Notice that the proof did not utilize (CRI3).


The goal of this section is to present a condition sufficient to force a classical ring 
of integers to be Dedekind. By Lemma 9.12.1 all that is needed is a proof that $\mathcal{O}_K$
is Noetherian.

Since the proof involves bilinear forms and dual bases, it might be useful to review 
these concepts from linear algebra in the next two paragraphs enclosed in brackets. 

[Let V be a finite-dimensional vector space over a field F. In general, a mapping
$B : V \times V \to F$ is a symmetric bilinear form if and only if:


(i) $B(x, ay + bz) = aB(x, y) + bB(x, z)$
(ii) $B(ax + by, z) = aB(x, z) + bB(y, z)$
(iii) $B(x, y) = B(y, x)$


for each $(x, y, z, a, b) \in V \times V \times V \times F \times F$. Thus for each vector $x \in V$, statement
(i) asserts that the mapping $\lambda_x : V \to F$ defined by $y \mapsto B(x, y)$ is a functional,
that is, a vector of the dual space $V^* := \mathrm{Hom}_F(V, F)$. The bilinear form B is said to
be non-degenerate if and only if the linear transformation $\lambda : V \to V^*$ defined by


$x \mapsto \lambda_x$, for all $x \in V$


is an injection, and so, as V is finite-dimensional over F, a bijection. Thus the bilinear
form B is non-degenerate if and only if $\ker \lambda = 0$, that is, the only element x such
that B(x, y) = 0 for every element $y \in V$ is x = 0.

For any finite F-basis $\{x_i\}$ of V and each index j, there exists a functional $f_j$ such that $f_j(x_j) = 1$,
while $f_j(x_i) = 0$ for $i \ne j$. If the bilinear form B is non-degenerate, the previous
paragraph tells us that the associated mapping $\lambda : V \to V^* := \mathrm{Hom}_F(V, F)$ is
surjective. Thus for each functional $f_j$ described just above, there exists a vector $y_j$
such that $\lambda(y_j) = f_j$. Thus, for each index j, one has


$B(y_j, x_i) = \delta_{ij}$, where $\delta_{ij} = 1$ if i = j and is zero otherwise.


We call $\{y_j\}$ a dual basis of $\{x_i\}$ with respect to the form B. Note that a dual basis exists
only if the vector space V is finite dimensional, and the form B is non-degenerate.]
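
The review above can be turned into a small computation. If G is the Gram matrix of B in the basis $\{x_i\}$, with entries $G_{ik} = B(x_i, x_k)$, and $y_j = \sum_k c_{jk} x_k$, then $B(y_j, x_i) = \sum_k c_{jk} G_{ki}$, so the dual-basis condition says exactly that the coefficient matrix $(c_{jk})$ is $G^{-1}$. The following is a minimal sketch over the rationals; the function names are illustrative, and exact arithmetic with `Fraction` is a choice made here, not something prescribed by the text.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a square matrix over Q by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity matrix; all entries become exact Fractions.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("degenerate form: the Gram matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dual_basis(gram):
    """Row j holds the coordinates of y_j in the basis {x_i}, so B(y_j, x_i) = delta_ij."""
    return invert(gram)
```

A singular Gram matrix corresponds to a degenerate form, in which case no dual basis exists and the sketch raises an error.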

Now, as above, consider D, F = F(D), K a field containing F as a subfield so
that $\dim_F(K)$ is finite, and consider $\mathcal{O}_K$, the subring of elements of K that are integral
with respect to D. Let $\mathcal{T}$ be the collection of F-linear transformations $K \to F$. This
collection contains the “zero” mapping $K \to \{0\}$, and of course many others. We
say that a transformation $T \in \mathcal{T}$ is tracelike if and only if


$T(\mathcal{O}_K) \subseteq D.$


We now offer the following: 


Theorem 9.12.2 Let $\mathcal{O}_K$ be a classical ring of integral elements of a field K, with
respect to the subdomain D so that conditions (CRI1), (CRI2) and (CRI3) hold for
D. Suppose a non-zero tracelike transformation $T : K \to F$ exists. Then $\mathcal{O}_K$ is a
Noetherian domain, and so (by Lemma 9.12.1) is a Dedekind domain.


Proof Let $T : K \to F$ be the tracelike transformation whose existence was assumed.
From T we define a symmetric bilinear form $B_T : K \times K \to F$, by the following
recipe:

$B_T(x, y) := T(xy)$, for all $x, y \in K$.


Next we show that the form $B_T$ is non-degenerate. If this form were degenerate
there would exist a non-zero element x in the field K such that T(xy) = 0 for all
elements $y \in K$. Since xy wanders over all of K as y wanders over K, we see that
this means T(K) = 0. But that is impossible as T was assumed to be non-zero.

Now by (CRI3) K has finite dimension over F and so we may choose an F-basis
of K, say $X := \{x_1, \ldots, x_n\}$. By Corollary 9.8.5, we may assume that each basis
element $x_i$ lies in $\mathcal{O}_K$. Since K is finite-dimensional, and $B_T$ is a non-degenerate
form, there exists a so-called dual basis $\{y_i\}$ of K, such that $B_T(y_i, x_j)$ is 0 if $i \ne j$,
and is 1 if i = j.

Now consider an arbitrary integral element $\beta \in \mathcal{O}_K$. We may write $\beta = \sum a_i y_i$,
since $\{y_i\}$ is a basis for K. Then for any fixed index j, we see that $T(\beta \cdot x_j) = a_j$,
where $a_j \in F$. Since both $x_j$ and $\beta$ belong to $\mathcal{O}_K$, so does the product $\beta x_j$. Since T
is tracelike, we must conclude that $T(\beta \cdot x_j) = a_j$ belongs to D. Thus $\beta$ is a D-linear
combination of the $y_i$. Since $\beta$ was an arbitrary element of $\mathcal{O}_K$, we must conclude
that

$\mathcal{O}_K \subseteq Dy_1 \oplus Dy_2 \oplus \cdots \oplus Dy_n := M,$


a finitely generated D-module. Now D is Noetherian, and so, by Noether’s Theorem
(Theorem 8.2.12), M is also Noetherian. By Lemma 8.2.2, part (i), each D-submodule
of M is also Noetherian. Thus $\mathcal{O}_K$ is also Noetherian, and the proof is complete.
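
The key step of the proof, that the coordinates $a_j = T(\beta x_j)$ land in D, can be traced in a small example. The sketch below assumes K = Q(i), D = Z, the basis $x_1 = 1$, $x_2 = i$, and takes T to be the usual field trace T(a + bi) = 2a; none of these choices comes from the text, and the helper names are illustrative. One checks by hand that $y_1 = 1/2$, $y_2 = -i/2$ is the dual basis for $B_T$.

```python
from fractions import Fraction

def mul(u, v):
    """(a + b i)(c + d i), with elements of Q(i) stored as pairs (a, b)."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)

def trace(u):
    """Assumed tracelike map T(a + b i) = 2a from Q(i) to Q."""
    return 2 * u[0]

# Basis x_1 = 1, x_2 = i and its dual basis y_1 = 1/2, y_2 = -i/2 for B_T(x, y) = T(xy).
X = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
Y = [(Fraction(1, 2), Fraction(0)), (Fraction(0), Fraction(-1, 2))]

def coefficients(beta):
    """a_j = T(beta * x_j): the coordinates of beta with respect to the dual basis Y."""
    return [trace(mul(beta, x)) for x in X]

def recombine(coeffs):
    """Form the sum of a_j * y_j, which should return the original element."""
    return (sum(a * y[0] for a, y in zip(coeffs, Y)),
            sum(a * y[1] for a, y in zip(coeffs, Y)))
```

For the Gaussian integer β = 3 + 4i the coordinates are ordinary integers, matching the containment of $\mathcal{O}_K$ in $Dy_1 \oplus Dy_2$ used in the proof.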


Where does this leave us? Chap. 11 provides a detailed study of fields and field
extensions. Among field extensions, $F \subseteq K$, there is a class called separable extensions.
It will emerge that for any separable extension $F \subseteq K$ with $\dim_F K$ finite, there
does exist a non-zero tracelike linear transformation $K \to F$ (see Corollary 11.7.6
in Sect. 11.7.4). Moreover, the extension $F \subseteq K$ is always separable if the field F
(and hence K) has characteristic zero (Lemma 11.5.4).

Of course, historically, the prize integral domain motivating all of this, is the ring 
of algebraic integers. That would be the ring of integral elements $\mathcal{O}_K$ where D = Z,
the ring of integers, and the extension field K has finite-dimension as a vector space 
over the field of rational numbers Q. The result of the previous paragraph (anticipating 
results from Chap. 11, to be sure) together with Theorem 9.12.2 yields the following. 


Corollary 9.12.3. Any ring of algebraic integers is a Dedekind domain. 


9.13 Exercises


9.13.1 Exercises for Sects. 9.2, 9.3, 9.4 and 9.5 


1. Earlier, in Theorem 7.4.7, we learned that every finite integral domain is a field. 
Prove the following theorem which replaces finiteness by finite dimension. 


Theorem 9.13.1 Suppose an integral domain D is finite-dimensional over a subfield 
F. Then D is a field. 


[Hint: If F = D there is nothing to prove, so we may assume $\dim_F D = n > 1$ and
proceed by induction on n. Choose an element $\alpha \in D - F$, and consider a minimal
submodule M for the subring $F[\alpha]$. (Why does it exist?) Prove that it has the
form $mF[\alpha]$ (i.e., it is cyclic), and that there is a right $F[\alpha]$-module isomorphism
$\phi : M \to F[x]/I$, where $\phi(m\alpha) = x + I$, and I is the ideal in F[x] consisting of all
polynomials $r(x) \in F[x]$ such that $m r(\alpha) = 0$. Cite the theorems that force I to
be a maximal principal ideal F[x]p(x), and explain why p(x) is an irreducible
polynomial. Noting that D is a domain, conclude that $p(\alpha)$ is the zero element of
D and so $K := F[\alpha] \cong F[x]/F[x]p(x)$ as rings, and so K is a field. Finish the
induction proof.]


2. Let D be a fixed unique-factorization-domain (UFD), and let p be a prime element
in D. Then the principal ideal (p) := pD is a maximal ideal and D/(p) is a
field. The next few exercises concern the following surjection between integral
domains:

$m_p : D[x] \to (D/(p))[x],$


which reduces mod p the coefficients of any polynomial of D[x].

Show that $m_p$ is a ring homomorphism. Conclude that the principal ideal in D[x]
generated by p is a prime ideal and from this, that p is a prime element in the
ring D[x].

3. This exercise has two parts. In each of them D is a UFD and is embedded in the 
polynomial ring D[x] as the polynomials of degree zero together with the zero 
polynomial. For every prime element p in D we let $F_p := D/(p)$, a field, and let
F := F(D) be the field of fractions of D. We fix a maximal ideal M in D[x] and
set K = D[x]/M, another field. Our over-riding assumption in this exercise is
that $M \cap D = (0)$, that is, M contains no non-zero polynomials of degree zero.
Prove the following statements:


(a) For any prime $p \in D$, show that $m_p(M) = F_p[x]$.
[Hint: If $m_p(M) \ne F_p[x]$, then $\ker(m_p) + M$ is a proper ideal of D[x]. Since
$M \cap D = (0)$, we see that $p \in \ker(m_p) - M$, contradicting the maximality
of M.]

(b) Let n be the minimal degree of a non-zero polynomial in M. (Since $M \cap D =
(0)$, n is positive.) Then there exists a prime element b(x) in D[x] of degree
n such that M = D[x]b(x), a principal ideal of D[x].
[Hint: Choose a polynomial $b'(x) \in M$ such that $\deg b'(x) = n$. Since D[x]
is a UFD, $b'(x)$ is a product of prime elements of D[x], and so one of these
prime factors, b(x), must belong to the prime ideal M. Since $M \cap D = (0)$,
$\deg b(x) = n$.

Now consider any non-zero polynomial $a(x) \in M$. Using the fact that F[x]
is a Euclidean domain as well as the minimality of n, show that there exists
a non-zero element $d \in D$ such that da(x) is divisible by b(x). Since b(x) is a prime
element of D[x] which does not divide d (because of its degree), b(x) divides
a(x).]


4. Prove the following result: 


Theorem 9.13.2 Assume D is a UFD with infinitely many association classes of
primes. Then, if M is a maximal ideal of D[x], $M \cap D \ne 0$, that is, M contains a
non-zero polynomial of degree zero.


[Hint: By way of contradiction assume $M \cap D = (0)$. Let p be any prime element
in D. By Exercise 3, part (a), there exists a polynomial $a(x) \in M$ such
that $m_p(a(x)) = 1$, the multiplicative identity element of $F_p[x]$. By part (b) of
Exercise 3, $a(x) = b(x) \cdot e(x)$, for some $e(x) \in D[x]$. Apply the morphism $m_p$
to this factorization to conclude that $m_p(b(x))$ has degree zero in the polynomial
ring $F_p[x]$. Thus, writing $b(x) = \sum b_i x^i$, $b_i \in D$, we see that if i > 0, then $b_i$ is
divisible by p. But this is true for any prime, and since there are infinitely many
association classes of primes, this can be true only if $b_i = 0$ for i > 0. But then
$b(x) \in D \cap M$, a contradiction. Why?]


5. Let D be a UFD with infinitely many association classes of primes. Show that
any maximal ideal of D[x] is generated by a prime $p \in D$ and a polynomial
p(x) such that $m_p(p(x))$ is irreducible in $F_p[x]$.

6. Let D be a UFD with finitely many association classes of primes with representatives
$\{p_1, \ldots, p_n\}$. Set $\pi = \prod p_i$ and form the principal ideal $M = D[x](1 + \pi x)$.
Then $M \cap D = (0)$. Show that M is a maximal ideal. [Hint: Show that each non-zero
element of D[x]/M is a unit of that factor ring, using the unique factorization
in D and the identity $x\pi \equiv -1 \bmod M$.]

7. (Another proof of Gauss’ Lemma.) Let D be a UFD. A polynomial h(x) in D[x]
is said to be primitive if and only if there is no prime element p of D dividing each
of the D-coefficients of h(x). Using the morphism $m_p$ of Exercise 2, show
that if f and g are primitive polynomials in D[x], then the product fg is also
primitive. [Hint: by assumption, there is no prime in D dividing f or g in D[x].
If a prime p of D divides fg in D[x] then $m_p(fg) = 0 = m_p(f) \cdot m_p(g)$. Then
use the fact that (D/(p))[x] is an integral domain.]

8. (Eisenstein’s Criterion) Let D be a UFD. Suppose p is a prime in D and


$f(x) = a_n x^n + \cdots + a_1 x + a_0$


is a polynomial in D[x] with $n \ge 1$ and $a_n \ne 0$. Further, assume that


(E1) p does not divide the lead coefficient $a_n$;
(E2) p divides each remaining coefficient $a_i$ for $i = 0, \ldots, n - 1$; and
(E3) $p^2$ does not divide $a_0$.


Show that f cannot factor in D[x] into polynomial factors of positive degree. In
particular f is irreducible in F[x], where F is the field of fractions of D. [Hint:
Suppose by way of contradiction that there was a factorization f = gh in D[x]
with g and h of degree at least one. Then apply the ring morphism $m_p$ to get


$m_p(f) = m_p(g) \cdot m_p(h) = m_p(a_n) x^n \ne 0.$


Now by hypothesis (E1) the polynomials f and $m_p(f)$ have the same degree,
while the degrees of g and h are at least as large as the respective degrees of
their $m_p$ images. Since f = gh, and $\deg m_p(f) = \deg m_p(g) + \deg m_p(h)$, we
must have the last two summands of positive degree. Since (D/(p))[x] is a UFD,
x must divide both $m_p(g)$ and $m_p(h)$. This means the constant coefficients of g
and h are both divisible by p. This contradicts hypothesis (E3).]

9. Show that under the hypotheses of the previous Exercise 8, one cannot conclude
that f is an irreducible element of D[x]. [Hint: In Z[x], the polynomial 2x + 6
satisfies the Eisenstein hypotheses with respect to the prime 3. Yet it has a proper
factorization in Z[x].]

10. Let p be a prime element of D. Using the morphism $m_p$, show that the subset


$B := \{p(x) = a_0 + a_1 x + \cdots + a_n x^n \mid a_i \equiv 0 \bmod p, \text{ for } i > 0\}$


of D[x] is a subdomain.

11. Let E be an arbitrary integral domain. For each non-zero polynomial $p(x) \in E[x]$
let $\ell(p)$ be its leading coefficient, that is, the coefficient of the highest power
of x possessing a non-zero coefficient. Thus p(x) is a monic polynomial if and
only if $\ell(p(x)) = 1$, and for non-zero polynomials p(x) and q(x), one has
$\ell(p(x)q(x)) = \ell(p(x)) \cdot \ell(q(x))$. Prove the following theorem:


Theorem 9.13.3 Let $D_1$ be an integral domain and suppose D is a subring of $D_1$.
Then, of course, D is also an integral domain and we have a containment of polynomial
rings: $D[x] \subseteq D_1[x]$. Suppose a(x), b(x), and c(x) are polynomials in $D_1[x]$
such that c(x) = a(x)b(x). Assume the following:

(i) a(x) and c(x) lie in D[x].

(ii) a(x) is monic.
(iii) $\ell(b(x)) \in D$.
Then $b(x) \in D[x]$.
In particular, if a(x), b(x), and c(x) are monic polynomials in $D_1[x]$ such that
c(x) = a(x)b(x), then two of the polynomials lie in D[x] if and only if the third
does.

[Hint: By hypothesis $\ell(a(x))\ell(b(x)) = \ell(c(x)) \in D$. Write out the three polynomials
a(x), b(x), and c(x) as linear combinations of powers of x with coefficients
$a_i$, $b_i$ and $c_i$, with $\ell(a(x))$, $\ell(b(x))$, and $\ell(c(x))$ represented as $a_s$, $b_t$ and $c_n$,
respectively (of course n = s + t, $a_s = 1$ and $b_t = c_n \in D$). By induction on k,
show that $b_{t-k} \in D$ for all k for which $0 \le k \le t$.]


12. Let F be a subfield of the field K. Consider the set

$L_F(K[x]) = \{p(x) \in K[x] \mid \ell(p(x)) \in F\} \cup \{0\}$


consisting of the zero polynomial and all non-zero polynomials whose lead
coefficient lies in the subfield F.


(a) Show that $L_F(K[x])$ is a subring of K[x].

(b) Show that $L_F(K[x])$ is a Euclidean Domain. [Hint: Use the Euclidean algorithm
for F[x] and the properties of the lead-coefficient function described
in the preamble of the preceding exercise.]
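
The hypotheses (E1)-(E3) of the Eisenstein exercise, and the caveat about 2x + 6 in the exercise that follows it, are easy to test mechanically when D = Z. The sketch below is illustrative only (the function names are not from the text), stores a polynomial as the list $[a_0, a_1, \ldots, a_n]$, and assumes that the caller supplies a prime p.

```python
from functools import reduce
from math import gcd

def eisenstein_applies(coeffs, p):
    """Check (E1)-(E3) for coeffs = [a_0, ..., a_n] in Z; p is assumed prime (not checked)."""
    if len(coeffs) < 2:
        return False                              # the criterion needs degree n >= 1
    *lower, lead = coeffs
    return (lead % p != 0                         # (E1) p does not divide a_n
            and all(a % p == 0 for a in lower)    # (E2) p divides a_0, ..., a_{n-1}
            and coeffs[0] % (p * p) != 0)         # (E3) p^2 does not divide a_0

def content(coeffs):
    """gcd of the coefficients; the polynomial is primitive exactly when this is 1."""
    return reduce(gcd, (abs(a) for a in coeffs), 0)
```

The polynomial 2x + 6 passes the test for p = 3 but has content 2, so it factors as 2(x + 3) in Z[x] with neither factor a unit. The criterion rules out only factors of positive degree.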


9.13.2 Exercises on Localization 


1. Using the definitions of multiplication and addition for fractions, fill in all the
steps in the proof of the following assertions:


(a) $(D_S, +)$ is an abelian group.
(b) Under multiplication of fractions, $D_S$ is a monoid.
(c) In $D_S$ multiplication is distributive with respect to addition.


2. Let D be an integral domain with group of units U(D). Show that if S is a
multiplicatively closed subset and $S'$ is either U(D)S or $U(D) \cup U(D)S$, then
the localizations $D_S$ and $D_{S'}$ are isomorphic.

3. Below are given examples of a domain D and a multiplicatively closed subset S of
non-zero elements of D. Describe as best you can the ring $D_S$.


(a) D = Z; S is all powers of 3. Is $D_S$ a local ring?

(b) D = Z; S consists of all numbers of the form $3^a 5^b$ where a and b are natural
numbers and a + b > 0. Is the condition a + b > 0 necessary?

(c) D = Z; S is all positive integers which are the sum of two squares.

(d) D = Z; S is all integers of the form $a^2 + 5b^2$.

(e) D = Z; S is all integers which are congruent to one modulo five. 

(f) D = Z[x]; S is all primitive polynomials of D. 


4. A valuation ring is an integral domain D such that if I and J are ideals of D,
then either $I \subseteq J$ or $J \subseteq I$. Prove that for an integral domain D, the following
three conditions are equivalent:


(i) D is a valuation ring.
(ii) If $a, b \in D$, then either $Da \subseteq Db$ or $Db \subseteq Da$.
(iii) If $\alpha \in E := F(D)$, then either $\alpha \in D$ or $\alpha^{-1} \in D$.


Thus, we see that the localizations $D_P$ defined at the prime ideal P (which were
defined on p. 312) are valuation rings.
5. Let D be a Noetherian valuation ring. 


(i) Prove that D is a PID. 
(ii) Prove that D contains a unique maximal ideal. (This is true even if D isn’t 
Noetherian.) 
(iii) Conclude that, up to associates, D contains a unique prime element. 


(A ring satisfying the above is often called a discrete valuation ring.) 
6. Let D be a discrete valuation ring, as in Exercise 5, above, and let $\pi$ be the prime,
unique up to associates. Define v(a) = r, where $a = \pi^r b$, $\pi \nmid b$.
Prove that v is an algorithm for D, giving D the structure of a Euclidean domain.
7. Let D be a Noetherian domain and let P be a prime ideal. Show that the local- 
ization Dp is Noetherian. 


9.13.3 Exercises for Sect. 9.9 


1. Let D be a Dedekind domain and let $I \subseteq D$ be an ideal. Show that $I \subseteq P$ for
the prime ideal P if and only if P is a factor in the prime factorization of I. More
generally, show that $I \subseteq P^e$ if and only if $P^e$ is a factor in the prime factorization
of I. Conclude that $I = P_1^{e_1} P_2^{e_2} \cdots P_r^{e_r}$ is the prime factorization of the ideal I
if and only if for each $i = 1, 2, \ldots, r$, $I \subseteq P_i^{e_i}$, but $I \not\subseteq P_i^{e_i + 1}$.

2. Let P and Q be distinct prime ideals of the Dedekind domain D. Show that
$PQ = P \cap Q$. (Note that $PQ \subseteq P \cap Q$ in any ring.)

3. Assume that D is a Dedekind domain and that $I = P_1^{e_1} P_2^{e_2} \cdots P_r^{e_r}$,
$J = P_1^{f_1} P_2^{f_2} \cdots P_r^{f_r}$. Show that


$I + J = P_1^{\min\{e_1, f_1\}} \cdots P_r^{\min\{e_r, f_r\}},$


$I \cap J = P_1^{\max\{e_1, f_1\}} \cdots P_r^{\max\{e_r, f_r\}}.$


Conclude that $AB = (A + B)(A \cap B)$. (Use Exercise 1. Note that this generalizes
Exercise 2.)

4. (Chinese Remainder Theorem) Let R be an arbitrary ring and let $I, J \subseteq R$ be
ideals. Say that I and J are coprime if I + J = R. Prove that if I, J are coprime
ideals in R, then $R/(I \cap J) \cong R/I \times R/J$. [Hint: map $R/(I \cap J) \to R/I \times R/J$
by $r + (I \cap J) \mapsto (r + I, r + J)$. Map $R/I \times R/J \to R/(I \cap J)$ as follows. Fix
$x \in I$, $y \in J$ with x + y = 1, and let $(a + I, b + J) \mapsto (xb + ya) + (I \cap J)$.]

5. Let R be a commutative ring and let $I \subseteq R$ be an ideal. Assume that for some
prime ideals $P_1, P_2, \ldots, P_r$ one has $I \subseteq P_1 \cup P_2 \cup \cdots \cup P_r$. Show that $I \subseteq P_i$
for some i.

6. Let D be a Dedekind domain and let $I, J \subseteq D$ be ideals. Show that I and J are
coprime (see Exercise 4) if and only if they do not have a common prime ideal
in their factorizations into prime ideals.

7. Let D be a Dedekind domain and let $I \subseteq D$ be an ideal. If $I = P_1^{e_1} P_2^{e_2} \cdots P_r^{e_r}$
is the factorization of I into a product of prime ideals (where $P_i \ne P_j$ if $i \ne j$)
in the Dedekind domain D, show that


$D/I \cong D/P_1^{e_1} \times D/P_2^{e_2} \times \cdots \times D/P_r^{e_r}.$


8. Let D be a Dedekind domain with ideal $I \subseteq D$. Factor I as $I = P_1^{e_1} P_2^{e_2} \cdots P_r^{e_r}$
into distinct prime-power factors.

(i) For each $i = 1, 2, \ldots, r$ select $\alpha_i \in P_i^{e_i} \setminus P_i^{e_i + 1}$. Use Exercise 7
to infer that there exists an element $\alpha \in D$ with $\alpha \equiv \alpha_i \bmod P_i^{e_i + 1}$,
$i = 1, 2, \ldots, r$.

(ii) Show that $\alpha \in I$, and so the principal ideal $\alpha D \subseteq I$.

(iii) Assuming r > 1 if necessary, show that the principal ideal $\alpha D$ factors as
$\alpha D = IJ$, where I and J are coprime.

(iv) Factor $J = Q_1^{f_1} Q_2^{f_2} \cdots Q_s^{f_s}$, a product of distinct prime powers. Show that
there exists an element $\beta \in I \setminus (Q_1 \cup Q_2 \cup \cdots \cup Q_s)$. (See Exercise 5.)

(v) Show that the ideal $(\alpha, \beta)D = I$.


The above shows that every ideal in a Dedekind domain is “almost principal”
inasmuch as no ideal requires more than two elements in a generating set, proving
Theorem 9.9.8.

9. In the Dedekind domain $D = \mathbb{Z}[\sqrt{-5}]$ show that $(3) = (3, 4 + \sqrt{-5})(3, 4 -
\sqrt{-5})$ is the factorization of the principal ideal (3) into a product of prime ideals.
10. Let E be a finite extension of the rational field Q, and set $R = \mathcal{O}_E$. Let P be
a prime ideal of R. Then $P \cap \mathbb{Z}$ is a prime ideal of Z so that $P \cap \mathbb{Z} = p\mathbb{Z}$, for
some prime number p. Show that we may regard Z/pZ as a subfield of R/P,
and that dimensions satisfy the inequality


$\dim_{\mathbb{Z}/p\mathbb{Z}}(R/P) \le \dim_{\mathbb{Q}} E,$


with equality if and only if p remains prime in $\mathcal{O}_E$. [Hint: Don’t forget to first
consider why R/P is a field.]
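
The inverse map in the hint to the Chinese Remainder Theorem exercise above can be written out for R = Z with I = mZ and J = nZ. This is a sketch under that assumption, with illustrative function names; for coprime m and n one has $I \cap J = mn\mathbb{Z}$. The elements $x \in I$ and $y \in J$ with x + y = 1 come from the extended Euclidean algorithm, and the pair (a + I, b + J) is sent to xb + ya.

```python
def extended_gcd(m, n):
    """Return (g, s, t) with s*m + t*n == g == gcd(m, n), for non-negative m, n."""
    if n == 0:
        return (m, 1, 0)
    g, s, t = extended_gcd(n, m % n)
    # g = s*n + t*(m mod n) = t*m + (s - (m // n)*t)*n
    return (g, t, s - (m // n) * t)

def crt_inverse(a, b, m, n):
    """Representative of the class mapping to (a + mZ, b + nZ), following the hint."""
    g, s, t = extended_gcd(m, n)
    if g != 1:
        raise ValueError("mZ and nZ are not coprime ideals")
    x, y = s * m, t * n            # x lies in mZ, y lies in nZ, and x + y = 1
    return (x * b + y * a) % (m * n)
```

Modulo I the element xb + ya reduces to ya = (1 − x)a, which is congruent to a; the same argument with the roles of x and y exchanged gives b modulo J.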


9.13.4 Exercises for Sect. 9.10 


1. If D is a PID prove that every fractional ideal of E is principal. 
2. Let D be a Dedekind domain with fraction field E. Prove that E itself is not a 
fractional ideal (except in the trivial case in which D is a field to begin with).
[Hint: If E properly contains D, one must prove that E is not a finitely generated
D-module.]

3. Let D be a Dedekind domain with ideal class group $C$. Let $P \subseteq D$ be a prime ideal
and assume that the order of the element $[P] \in C$ is $k > 1$. If $P^k = (\pi) := D\pi$
for some $\pi \in D$, show that $\pi$ is irreducible but not prime.

4. An integral domain D such that every non-zero non-unit $a \in D$ can be factored into
finitely many irreducible non-units is called an atomic domain. Prove that every 
Noetherian domain is atomic. (Note that uniqueness of factorization will typically 
fail.) 

5. Let D be a Dedekind domain with ideal class group of order at most 2. Prove
that the number of irreducible factors in a factorization of an element $a \in D$
depends only on $a$.¹⁰ [Hint: Note first that by Exercise 4, any non-unit of D can
be factored into finitely many irreducible elements. By induction on the minimal
length of a factorization of $a \in D$ into irreducible elements, we may assume
that $a$ has no prime factors. Next assume that $\pi \in D$ is a non-prime irreducible
element. If we factor the principal ideal into prime ideals, $(\pi) = Q_1 Q_2 \cdots Q_r$,
then the assumption guarantees that $Q_1 Q_2 = (\alpha)$ for some $\alpha \in D$. If $r > 2$,
then $(\pi)$ is properly contained in $Q_1 Q_2 = (\alpha)$ and so $\alpha$ is a proper divisor of $\pi$, a
contradiction. Therefore, it follows that a principal ideal generated by a non-prime
irreducible element factors into the product of two prime ideals. Now what?]

6. Let D be as above, i.e., a Dedekind domain with ideal class group of order at
most 2. Let $\pi_1, \pi_2 \in D$ be irreducible elements. As seen in Exercise 5 above,
any factorization of $\pi_1\pi_2$ will involve exactly two irreducible elements. Show
that, up to associates, there can be at most three distinct factorizations of $\pi_1\pi_2$
into irreducible elements. (As a simple illustration, it turns out that the Dedekind
domain $\mathbb{Z}[\sqrt{-5}]$ has class group of order 2; correspondingly we have distinct
factorizations: $21 = 3 \cdot 7 = (1 + 2\sqrt{-5})(1 - 2\sqrt{-5}) = (4 + \sqrt{-5})(4 - \sqrt{-5})$.)
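
The three factorizations of 21 quoted in Exercise 6 can be checked mechanically. The following is only a sketch, under the assumption that an element $a + b\sqrt{-5}$ is stored as the integer pair (a, b); the helper names `mul`, `norm` and `has_element_of_norm` are ours and not part of the text. Irreducibility of the six factors follows because their norms are 9, 49 and 21, and the search confirms that no element has norm 3 or 7.

```python
# Elements of Z[sqrt(-5)] are modelled as integer pairs (a, b) = a + b*sqrt(-5).

def mul(u, v):
    """Multiply (a + b*s)(c + d*s), using s*s = -5."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)

def norm(u):
    """Norm N(a + b*s) = a^2 + 5*b^2."""
    a, b = u
    return a * a + 5 * b * b

def has_element_of_norm(n):
    """Search for non-negative integers a, b with a^2 + 5*b^2 == n."""
    bound = int(n ** 0.5) + 1
    return any(a * a + 5 * b * b == n
               for a in range(bound + 1) for b in range(bound + 1))

factorizations = [
    ((3, 0), (7, 0)),      # 3 * 7
    ((1, 2), (1, -2)),     # (1 + 2s)(1 - 2s)
    ((4, 1), (4, -1)),     # (4 + s)(4 - s)
]
for u, v in factorizations:
    print(u, v, "product:", mul(u, v), "norms:", norm(u), norm(v))
print("norm 3 occurs:", has_element_of_norm(3))
print("norm 7 occurs:", has_element_of_norm(7))
```

Since a proper factor of an element of norm 9, 49 or 21 would have norm 3 or 7, the absence of such norms is what makes all six factors irreducible.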


9.13.5 Exercises for Sect. 9.11 


1. Let D be a ring in which every ideal $I \subseteq D$ is invertible. Prove that D is a
Dedekind domain. [Hint: First, as in the Proof of Theorem 9.11.7, show that D is
Noetherian. Now let $\mathcal{C}$ be the set of all ideals that are not products of prime ideals.
Since D is Noetherian, $\mathcal{C} \neq \emptyset$ implies that $\mathcal{C}$ has a maximal member $J$. Let $J \subseteq P$,
where P is a maximal ideal. Clearly $J \neq P$. Then $JP^{-1} \subseteq PP^{-1} = D$ and so
$JP^{-1}$ is an ideal of D; clearly $J \subseteq JP^{-1}$. If $J \neq JP^{-1}$, then $JP^{-1} = P_1 P_2 \cdots P_r$,
so $J = PP_1 P_2 \cdots P_r$. Thus $J = JP^{-1}$, so $JP = J$. This is a contradiction. Why?]

2. Here is an example of a non-invertible ideal in an integral domain D. Let

$$D = \{a + 3b\sqrt{-5} \mid a, b \in \mathbb{Z}\},$$

and let $I = (3, 3\sqrt{-5})$, i.e., $I$ is the ideal generated by 3 and $3\sqrt{-5}$. Show that
$I$ is not invertible. (An easy way to do this is to let $J = (3)$, the principal ideal
generated by 3, and observe that despite the fact that $I \neq J$, we have $I^2 = IJ$.)

¹⁰ See L. Carlitz, A characterization of algebraic number fields with class number two, Proc. Amer.
Math. Soc. 11 (1960), 391-392. In case R is the ring of integers in a finite extension of the rational
field, Carlitz also proves the converse.


3. This exercise gives a very important class of projective R-modules; see Sect. 8.4.4.
Let R be an integral domain with field of fractions F and let $I$ be an ideal of
R. Prove that if $I$ is invertible, then $I$ is a projective R-module. Conversely,
prove that if the ideal $I$ is finitely generated and projective as an R-module,
then $I$ is an invertible ideal. [Hint: Assuming that $I$ is invertible, there must
exist elements $\alpha_1, \alpha_2, \ldots, \alpha_n \in I$, $\beta_1, \beta_2, \ldots, \beta_n \in I^{-1}$ with $\sum \alpha_i\beta_i = 1$.
Let $\mathcal{F}$ be the free R-module with basis $\{f_1, f_2, \ldots, f_n\}$ and define $\sigma : I \to \mathcal{F}$
by $\sigma(\alpha) = \sum \alpha\beta_i f_i \in \mathcal{F}$. Show that $\sigma(I) \cong_R I$ and that $\sigma(I)$ is a direct
summand of $\mathcal{F}$. Apply Theorem 8.4.5 to infer that $I$ is projective. For the converse,
assume that $I = R[\alpha_1, \alpha_2, \ldots, \alpha_n]$ and let $\mathcal{F}$ be the free R-module with basis
$\{f_1, f_2, \ldots, f_n\}$. Let $\epsilon : \mathcal{F} \to I$ be the homomorphism given by $\epsilon(f_i) = \alpha_i$, $i =
1, 2, \ldots, n$. Show that since $I$ is projective, there is a homomorphism $\sigma : I \to \mathcal{F}$
such that $\epsilon \circ \sigma = 1_I$. For each $\alpha \in I$, write $\sigma(\alpha) = \sum a_i(\alpha) f_i$ and show that
for each $i = 1, 2, \ldots, n$ the elements $\beta_i := a_i(\alpha)/\alpha \in F$ are independent of
$0 \neq \alpha \in I$. Next, show that each $\beta_i \in I^{-1}$ and finally that $\sum \alpha_i\beta_i = 1$, proving
that $I$ is invertible. Note, incidentally, that this exercise shows that Dedekind rings
are hereditary in the sense of Exercise (19) in Sect. 8.5.3. Also, by Exercise 1 of
this subsection we see that a Noetherian domain is a Dedekind domain if and only
if every ideal is projective.]


4. Let R be a Dedekind domain with field of fractions E. If $I, J \subseteq E$ are fractional
ideals, and if $0 \neq \phi \in \mathrm{Hom}_R(I, J)$, prove that $\phi$ is injective. [Hint: Argue that
if $J_0 = \mathrm{im}\,\phi$, then $J_0$ is a projective R-module. Therefore one obtains $I =
\ker\phi \oplus I'$, where $I' \cong_R J_0$. Why is such a decomposition a contradiction?]


5. Let R be a Dedekind domain with field of fractions E, and let $I, J \subseteq E$ be
fractional ideals representing classes $[I], [J] \in \mathcal{C}_R$, the ideal class group of R.
If $[I] = [J]$, prove that $I \cong_R J$. (The converse is also true; see Exercise (9) in
Sect. 13.13.4 of Chap. 13.)


9.13.6 Exercises for Sect. 9.12 


1. Let F be a field and let x be an indeterminate. Prove that the ring $R = F[x^2, x^3]$
is not integrally closed, hence is not a UFD.

2. Prove the assertion that if $\alpha$ is an algebraic integer, so is $\sqrt{\alpha}$.

3. Let E be a finite field extension of the rational numbers. Suppose $\sigma$ is an
automorphism of E. Show that $\sigma$ leaves the algebraic integer ring $\mathcal{O}_E$ invariant. [Hint:
Since the multiplicative identity element 1 is unique, show that any automorphism
of E fixes $\mathbb{Z}$ and $\mathbb{Q}$ element-wise. The rest of the argument examines the effect
of such an automorphism on the definition of an algebraic integer.]

4. Show that $A(-6) := \mathbb{Z}[\sqrt{-6}]$ is not a UFD.

5. As we have seen, the ring $A(-5) = \mathbb{Z}[\sqrt{-5}]$ is not a UFD. Many odd things
happen in this ring. For instance, find an example of an irreducible element $\pi \in
\mathbb{Z}[\sqrt{-5}]$ and an element $\alpha \in \mathbb{Z}[\sqrt{-5}]$ such that $\pi$ doesn’t divide $\alpha$, but $\pi$ divides
$\alpha^2$. [Hint: look at factorizations of $9 = 3^2$.]


6. The following result is well-known to virtually every college student. Let $f(x) \in
\mathbb{Z}[x]$, and let $\frac{a}{b}$ be a rational root of f(x). If the fraction $\frac{a}{b}$ is in lowest terms, then
a divides the constant term of f(x) and b divides the leading coefficient of f(x).
If we ask the same question in the context of the ring $\mathbb{Z}[\sqrt{-5}]$, then the answer
is negative. Indeed, if we consider the polynomial $f(x) = 3x^2 - 2\sqrt{-5}\,x - 3 \in
\mathbb{Z}[\sqrt{-5}][x]$, then the zeroes are $\frac{2 + \sqrt{-5}}{3}$ and $\frac{-2 + \sqrt{-5}}{3}$. Since both 3 and $\pm 2 + \sqrt{-5}$
are non-associated irreducible elements, the fractions can be considered to
be in lowest terms. Yet neither of the numerators divides the constant term of f(x).
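
The claims of Exercise 6 are easy to confirm by exact arithmetic. The sketch below assumes that $a + b\sqrt{-5}$ is stored as a pair (a, b) of rationals; the function names `mul`, `f` and `divides` are illustrative choices of ours. It checks that both fractions are zeroes of f, and that neither numerator divides the constant term −3.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply (a + b*s)(c + d*s) with s*s = -5."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)

def f(x):
    """Evaluate f(x) = 3x^2 - 2*sqrt(-5)*x - 3 at x = (a, b)."""
    three_x2 = mul((Fraction(3), Fraction(0)), mul(x, x))
    two_s_x = mul((Fraction(0), Fraction(2)), x)
    return (three_x2[0] - two_s_x[0] - 3, three_x2[1] - two_s_x[1])

def divides(u, v):
    """True if u divides v in Z[sqrt(-5)]; u, v are integer pairs and u != 0."""
    a, b = u
    n = a * a + 5 * b * b            # N(u)
    q = mul(v, (a, -b))              # v * conj(u); the quotient is q / N(u)
    return q[0] % n == 0 and q[1] % n == 0

roots = [(Fraction(2, 3), Fraction(1, 3)), (Fraction(-2, 3), Fraction(1, 3))]
for r in roots:
    print("f(root) =", f(r))
print(divides((2, 1), (-3, 0)), divides((-2, 1), (-3, 0)))
```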


7. We continue on the theme set in Exercise 6, above. Let D be an integral domain
with field of fractions $\mathcal{F}(D)$. Assume the following condition on the domain D:

For every polynomial $f(x) = a_n x^n + \cdots + a_0 \in D[x]$, with $a_0, a_n \neq 0$, and every
zero $\frac{a}{b} \in \mathcal{F}(D)$ which is a fraction in lowest terms (i.e., a and b have no common
non-unit factors and $f(a/b) = 0$), $a$ divides $a_0$ and $b$ divides $a_n$.

Now prove that for such a ring every irreducible element is actually prime. [Hint:
Let $\pi \in D$ be an irreducible element and assume that $\pi \mid uv$, but that $\pi$ doesn’t
divide either u or v. Let $uv = r\pi$, $r \in D$, and consider the polynomial $ux^2 -
(\pi + r)x + v \in D[x]$.]


8. Let K be a field such that K is the field of fractions of both subrings $R_1, R_2 \subseteq K$.
Must it be true that K is the field of fractions of $R_1 \cap R_2$? [Hint: A counter-example
can be found in the rational function field K = F(x).]

9. Let $\mathbb{Q} \subseteq K$ be a finite-degree extension of fields. Prove that if K is the field of
fractions of subrings $R_1, R_2 \subseteq K$, then K is also the field of fractions of $R_1 \cap R_2$.

10. Again, let $\mathbb{Q} \subseteq K$ be a finite-degree extension of fields. This time, let $\{R_\alpha \mid \alpha \in \mathcal{A}\}$
consist of all the subrings of K having K as field of fractions. Show that K is not
the field of fractions of $\bigcap_{\alpha \in \mathcal{A}} R_\alpha$. (In fact, $\bigcap_{\alpha \in \mathcal{A}} R_\alpha = \mathbb{Z}$.)


Appendix: The Arithmetic of Quadratic Domains 


Introduction 


We understand the term quadratic domain to indicate the ring of algebraic integers 
of a field K of degree two over the rational number field Q. The arithmetic properties 
of interest are those which tend to isolate Euclidean domains: Are these principal 
ideal domains (or PID’s)? Are they unique factorization domains (or UFD’s, also 
called “factorial domains” in the literature)?

Quadratic Fields


Throughout $\mathbb{Q}$ will denote the field of rational numbers. Suppose K is a subfield of
the complex numbers. Then it must contain 1, 2 = 1 + 1, all integers and all fractions
of such integers. Thus K contains $\mathbb{Q}$ as a subfield and so is a vector space over $\mathbb{Q}$.
Then K is a quadratic field if and only if $\dim_{\mathbb{Q}}(K) = 2$.

Suppose now that K is a quadratic field, so as a $\mathbb{Q}$-space,

$$K = \mathbb{Q} \oplus \omega\mathbb{Q}, \quad \text{for any chosen } \omega \in K - \mathbb{Q}.$$

Then $\omega^2 = -b\omega - c$ for rational numbers b and c, so $\omega$ is a root of the monic
polynomial
$$p(x) = x^2 + bx + c$$

in $\mathbb{Q}[x]$. Evidently, p(x) is irreducible, for otherwise, it would factor into two linear
factors, one of which would have $\omega$ for a rational root, against our choice of $\omega$. Thus
p(x) has two roots $\omega$ and $\bar\omega$ given by

$$(-b \pm \sqrt{b^2 - 4c})/2.$$

We see that K is a field generated by $\mathbb{Q}$ and either $\omega$ or $\bar\omega$, a fact which can be
expressed by writing
$$K = \mathbb{Q}(\omega) = \mathbb{Q}(\bar\omega).$$
Evidently
$$K = \mathbb{Q}(\sqrt{d}), \quad \text{where } d = b^2 - 4c.$$


Observe that substitution of $\omega$ for the indeterminate x in each polynomial of $\mathbb{Q}[x]$
produces an onto ring homomorphism,

$$v_\omega : \mathbb{Q}[x] \to K,$$

whose kernel is the principal ideal $p(x)\mathbb{Q}[x]$.¹¹ But as $v_{\bar\omega} : \mathbb{Q}[x] \to K$ has the same
kernel, we see that the $\mathbb{Q}$-linear transformation

$$\sigma : \mathbb{Q} \oplus \omega\mathbb{Q} \to \mathbb{Q} \oplus \bar\omega\mathbb{Q},$$

which takes each vector $a + b\omega$ to $a + b\bar\omega$, $a, b \in \mathbb{Q}$, is an automorphism of the field
K. Thus $\sigma(\omega) = \bar\omega$, and $\sigma^2 = 1_K$ (why?).

Associated with the group $\langle\sigma\rangle = \{1_K, \sigma\}$ of automorphisms of K are two further
mappings.

¹¹ This was the “evaluation homomorphism” of Theorem 7.3.3, p. 207.

First is the trace mapping,
$$T : K \to \mathbb{Q},$$

which takes each element $\zeta$ of K to $\zeta + \zeta^\sigma$. Clearly this mapping is $\mathbb{Q}$-linear.
The second mapping is the norm mapping,

$$N : K \to \mathbb{Q},$$

which takes each element $\zeta$ of K to $\zeta\zeta^\sigma$. This mapping is not $\mathbb{Q}$-linear, but it follows
from its definition (and the fact that multiplication in K is commutative) that it is
multiplicative, that is,

$$N(\zeta\psi) = N(\zeta)N(\psi),$$

for all $\zeta, \psi$ in K.

Just to make sure everything is in place, consider the roots $\omega$ and $\bar\omega$ of the
irreducible polynomial $p(x) = x^2 + bx + c$. Since p(x) factors as $(x - \omega)(x - \bar\omega)$ in
K[x], we see that the trace is

$$T(\omega) = T(\bar\omega) = -b \in \mathbb{Q},$$

and that the norm is

$$N(\omega) = N(\bar\omega) = c, \quad \text{also in } \mathbb{Q}.$$
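
The trace and norm are simple enough to compute by machine, which gives a quick sanity check of the two displayed identities. The following sketch assumes the particular field $K = \mathbb{Q}(\sqrt{5})$ and stores $a + b\sqrt{5}$ as a pair of rationals (a, b); the constant `D` and the helper names are our own illustrative choices. Here $\omega = (-1 + \sqrt{5})/2$ is a root of $x^2 + bx + c$ with b = 1 and c = −1.

```python
from fractions import Fraction

D = 5  # assumed square-free d, so that K = Q(sqrt(5))

def conj(z):
    """The automorphism sigma: a + b*sqrt(D) -> a - b*sqrt(D)."""
    return (z[0], -z[1])

def mul(z, w):
    """Product in K of z = (a, b) and w = (c, e)."""
    return (z[0] * w[0] + D * z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def trace(z):
    """T(z) = z + sigma(z), the rational number 2a."""
    return 2 * z[0]

def norm(z):
    """N(z) = z * sigma(z) = a^2 - D*b^2."""
    return mul(z, conj(z))[0]

omega = (Fraction(-1, 2), Fraction(1, 2))   # root of x^2 + x - 1
psi = (Fraction(3), Fraction(2, 7))         # an arbitrary second element
print("T(omega) =", trace(omega), " N(omega) =", norm(omega))
print("multiplicative:", norm(mul(omega, psi)) == norm(omega) * norm(psi))
```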


Quadratic Domains 


We wish to identify the ring A of algebraic integers of K. These are the elements
$\omega \in K$ which are roots of at least one monic irreducible polynomial in $\mathbb{Z}[x]$. At this
stage the reader should be able to prove:


Lemma 9.A.1 An element $\zeta \in K - \mathbb{Q}$ is an algebraic integer if and only if its trace
$T(\zeta)$ and its norm $N(\zeta)$ are both rational integers.


Now putting $\zeta = \sqrt{d}$, where d is square-free and $K = \mathbb{Q}(\sqrt{d})$, the lemma
implies:


Lemma 9.A.2 If d is a square-free integer, then the ring of algebraic integers of
$K = \mathbb{Q}(\sqrt{d})$ is the $\mathbb{Z}$-module spanned by

$\{1, \sqrt{d}\}$ if $d \equiv 2, 3 \bmod 4$,
$\{1, (1 + \sqrt{d})/2\}$ if $d \equiv 1 \bmod 4$.


Thus the ring of integers for $\mathbb{Q}(\sqrt{7})$ is the set $\{a + b\sqrt{7} \mid a, b \in \mathbb{Z}\}$. But if $d = -3$,
then $(1 + \sqrt{-3})/2$ is a sixth root of unity, and the ring of integers for $\mathbb{Q}(\sqrt{-3})$ is the
domain of Eisenstein numbers.
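
The criterion of Lemma 9.A.1 is directly computable. As a sketch only, assume an element $a + b\sqrt{d}$ is given by rational a, b and a square-free integer d; the function names below are hypothetical. The test confirms that $(1 + \sqrt{-3})/2$ is an algebraic integer while $(1 + \sqrt{7})/2$ is not, in agreement with the two cases of Lemma 9.A.2.

```python
from fractions import Fraction

def is_algebraic_integer(a, b, d):
    """Test zeta = a + b*sqrt(d) by checking that trace and norm are integers."""
    a, b = Fraction(a), Fraction(b)
    trace = 2 * a
    norm = a * a - d * b * b
    return trace.denominator == 1 and norm.denominator == 1

def integral_basis(d):
    """The spanning set of Lemma 9.A.2, as pairs (a, b) meaning a + b*sqrt(d)."""
    if d % 4 == 1:
        return [(Fraction(1), Fraction(0)), (Fraction(1, 2), Fraction(1, 2))]
    return [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]

half = Fraction(1, 2)
print(is_algebraic_integer(half, half, -3))   # (1 + sqrt(-3))/2
print(is_algebraic_integer(half, half, 7))    # (1 + sqrt(7))/2
print(integral_basis(-3), integral_basis(7))
```

Note that Python's `%` returns a non-negative remainder for a positive modulus, so `d % 4 == 1` also handles negative d such as −3.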
Which Quadratic Domains Are Euclidean, Principal Ideal
or Factorial Domains?


Fix a square free integer d and let A(d) denote the full ring of algebraic integers of
$K = \mathbb{Q}(\sqrt{d})$. A great deal of effort has been devoted to the following questions:


1. When is A(d) a Euclidean domain? 
2. More generally, when is A(d) a principal ideal domain? 


In this chapter we showed that the Gaussian integers, A(−1), and the Eisenstein
numbers, A(−3), were Euclidean by showing that the norm function could act as the
function $g : A(d) \to \mathbb{Z}$ with respect to which the Euclidean algorithm is defined. In
the case that the norm can act in this role, we say that the norm is algorithmic.
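
To see concretely what it means for the norm to be algorithmic, here is a sketch of division with remainder in the Gaussian integers A(−1). It assumes a Gaussian integer a + bi is stored as the integer pair (a, b) and uses the usual device of rounding the exact quotient to a nearest lattice point; the function names are ours. The remainder r then satisfies N(r) ≤ N(y)/2 < N(y).

```python
def gauss_divmod(x, y):
    """Return (q, r) with x = q*y + r and N(r) < N(y), for Gaussian integers, y != 0."""
    a, b = x
    c, d = y
    n = c * c + d * d                          # N(y)
    re, im = a * c + b * d, b * c - a * d      # x * conj(y) = re + im*i
    # Round re/n and im/n to nearest integers using integer arithmetic only.
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

def gauss_gcd(x, y):
    """Euclidean algorithm in Z[i]; the result is a gcd up to a unit."""
    while y != (0, 0):
        x, y = y, gauss_divmod(x, y)[1]
    return x

def norm(z):
    return z[0] * z[0] + z[1] * z[1]

q, r = gauss_divmod((11, 3), (1, 8))
print(q, r, norm(r), "<", norm((1, 8)))
print(gauss_gcd((5, 0), (1, 2)))
```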

The two questions have been completely answered when d is negative. 


Proposition 9.A.1 The norm is algorithmic for A(d) when d = −1, −2, −3, −7
and −11. For all remaining negative values of d, the ring A(d) is not even Euclidean.
However, A(d) is a principal ideal domain for

$$d = -1, -2, -3, -7, -11, -19, -43, -67, -163,$$

and for no further negative values of d.


There are ways of showing that A(d) is a principal ideal domain, even when the 
norm function is not algorithmic. For example: 


Lemma 9.A.3 Suppose d is a negative square-free integer and the (positively valued)
norm function of A(d) has this property:


• If $N(x) \ge N(y)$ then either y divides x or else there exist elements u and v
(depending on x and y) such that

$$0 < N(ux - yv) < N(y).$$


Then A(d) is a principal ideal domain.


This lemma is used, for example, to show that A(−19) is a PID.¹²
For positive d (where $\mathbb{Q}(\sqrt{d})$ is a real field) the status of these questions is not so
clear. We can only report the following:


Proposition 9.A.2 Suppose d is a positive square-free integer. Then the norm function
is algorithmic precisely for


d = 2,3,5, 6,7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73. 


¹² See an account of this in Pollard’s The Theory of Algebraic Numbers, Carus Monograph no. 9,
Math. Assoc. America, p. 100, and in Wilson, J.C., A principal ideal domain that is not Euclidean,
Mathematics Magazine, vol. 46 (1973), pp. 34-48.

The domain A(d) is known to be a principal ideal domain for many further values
of d. But we don’t even know if this can occur for infinitely many values of d!
Moreover, among these PID’s, how many are actually Euclidean by some function
g other than the norm function? Are there infinitely many?


Non-Factorial Subdomains of Euclidean Domains 


Here it is shown that standard Euclidean domains such as the Gaussian integers 
contain subdomains for which unique factorization fails. These nice examples are 
due to P. J. Arpaia (“A note on quadratic Euclidean Domains”, Amer. Math. Monthly,
vol. 75 (1968), pp. 864-865).

Let $A_0(d)$ be the subdomain of A(d) consisting of all integral linear combinations
of 1 and $\sqrt{d}$, where d is a square-free integer. Thus $A_0(d) = A(d)$ if $d \equiv 2, 3 \bmod 4$,
and is a proper subdomain of A(d) only if $d \equiv 1 \bmod 4$.

Let p be a rational prime. Let $A_0/(p)$ be the ring whose elements belong to the
vector space

$$(\mathbb{Z}/(p))1 \oplus (\mathbb{Z}/(p))\sqrt{d},$$

where multiplication is defined by setting $(\sqrt{d})^2$ to be the residue class modulo p
containing d. Precisely, $A_0/(p)$ is the ring $\mathbb{Z}[x]/I$ where $I$ is the ideal generated by
the prime p and the polynomial $x^2 - d$. It should be clear that $A_0/(p)$ is a field if d is
non-zero and not a quadratic residue mod p, and is the direct (ring) sum of two fields
if d is a non-zero quadratic residue mod p. If p divides d, it has a 1-dimensional nilpotent
ideal. But in all cases, it contains $\mathbb{Z}/(p)$ as a subfield containing the multiplicative
identity element.
There is a straightforward ring homomorphism

$$n_p : A_0 \to A_0/(p),$$

which takes $a + b\sqrt{d}$ to $\bar a + \bar b\sqrt{d}$ where “bar” denotes the taking of residue classes
of integers mod p. Now set

$$B(d, p) = n_p^{-1}(\mathbb{Z}/(p)).$$

Then B(d, p) is a ring, since it is the preimage under $n_p$ of a subfield of the image
$A_0/(p)$. In fact, as a set,

$$B(d, p) = \{a + b\sqrt{d} \in A_0 \mid b \equiv 0 \bmod p\}.$$

Clearly it contains the ring of integers and so is a subdomain of A(d). Note that the
domains $A_0$ and B(d, p) are both $\sigma$-invariant.


In the subdomain B(d, p), we will show that under a mild condition, there are 
elements which can be factored both as a product of two irreducible elements and 
as a product of three irreducible elements. Since d and p are fixed, we write B for 
B(d, p) henceforward. 


Lemma 9.A.4 An element of B of norm one is a unit in B.


Proof Since $B = B^\sigma$, $N(\zeta) = \zeta\zeta^\sigma = 1$ for $\zeta \in B$ implies $\zeta^\sigma$ is an inverse in B for
$\zeta$.


Lemma 9.A.5 Suppose $\zeta$ is an element of $A_0$ whose norm $N(\zeta)$ is the rational prime
p. Then $\zeta$ does not belong to B.


Proof By hypothesis, $\zeta = a + b\sqrt{d}$ with $N(\zeta) = a^2 - b^2 d = p$. If p divided the integer b, it would divide
the integer a, so $p^2$ would in fact divide $N(\zeta)$, an absurdity. Thus p does not divide b
and so $\zeta$ is not an element of B.


Corollary 9.A.1 If an element of B has norm $p^2$, then it is irreducible in B.


Proof By Lemma 9.A.4, elements of norm one in B are units in B. So if an element
of norm $p^2$ in B were to have a proper factorization, it must factor into two elements
of norm p. But the preceding lemma shows that B contains no such factors.


Theorem 9.A.1 Suppose there exist integers a and b such that $a^2 - db^2 = p$
(that is, $A_0$ contains an element of norm p). Then the number $p^3$ admits these two
factorizations into irreducible elements of B:

$$p^3 = p \cdot p \cdot p$$
$$p^3 = (pa + pb\sqrt{d}) \cdot (pa - pb\sqrt{d}).$$

Proof The factor p has norm $p^2$, so it is irreducible by Corollary 9.A.1. Each factor
$pa \pm pb\sqrt{d}$ lies in B and has norm $p^3$; a proper factorization of it in B would require
a factor whose norm is p up to sign, and the argument of Lemma 9.A.5 excludes this.


We reproduce Arpaia’s table displaying $p = a^2 - db^2$ for all d for which the
norm function on A(d) is algorithmic.


  d    p = a² − db²            d    p = a² − db²          d    p = a² − db²
−11    47 = 6² − (−11)·1²      5    11 = 4² − 5·1²       21    −17 = 2² − 21·1²
 −7    11 = 2² − (−7)·1²       6    −5 = 1² − 6·1²       29    −13 = 4² − 29·1²
 −3     7 = 2² − (−3)·1²       7    −3 = 2² − 7·1²       33    −29 = 2² − 33·1²
 −2     3 = 1² − (−2)·1²      11    −7 = 2² − 11·1²      37    107 = 12² − 37·1²
 −1     2 = 1² − (−1)·1²      13     3 = 4² − 13·1²      41    −37 = 2² − 41·1²
  2    −7 = 1² − 2·2²         17   −13 = 2² − 17·1²      57    −53 = 2² − 57·1²
  3    −2 = 1² − 3·1²         19    −3 = 4² − 19·1²      73    −37 = 6² − 73·1²
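
The table and the two factorizations of Theorem 9.A.1 can be verified row by row. The sketch below is our own illustration: the list `TABLE` transcribes the triples (d, p, a, b) from the table above, and an element $u + v\sqrt{d}$ is stored as the integer pair (u, v). For each row it checks that $a^2 - db^2 = p$ with |p| prime, that $pa \pm pb\sqrt{d}$ lie in B(d, p), and that their product is $p^3$. It does not test irreducibility, which is the content of Corollary 9.A.1.

```python
TABLE = [  # (d, p, a, b), transcribed from the table above
    (-11, 47, 6, 1), (-7, 11, 2, 1), (-3, 7, 2, 1), (-2, 3, 1, 1), (-1, 2, 1, 1),
    (2, -7, 1, 2), (3, -2, 1, 1), (5, 11, 4, 1), (6, -5, 1, 1), (7, -3, 2, 1),
    (11, -7, 2, 1), (13, 3, 4, 1), (17, -13, 2, 1), (19, -3, 4, 1),
    (21, -17, 2, 1), (29, -13, 4, 1), (33, -29, 2, 1), (37, 107, 12, 1),
    (41, -37, 2, 1), (57, -53, 2, 1), (73, -37, 6, 1),
]

def is_prime(n):
    """Primality of |n| by trial division (adequate for these small values)."""
    n = abs(n)
    return n > 1 and all(n % k for k in range(2, int(n ** 0.5) + 1))

def mul(u, v, d):
    """(u0 + u1*sqrt(d)) * (v0 + v1*sqrt(d)) as an integer pair."""
    return (u[0] * v[0] + d * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def check_row(d, p, a, b):
    ok_norm = a * a - d * b * b == p and is_prime(p)
    left, right = (p * a, p * b), (p * a, -p * b)
    in_B = left[1] % p == 0 and right[1] % p == 0   # sqrt(d)-coefficient divisible by p
    ok_product = mul(left, right, d) == (p ** 3, 0)
    return ok_norm and in_B and ok_product

print(all(check_row(*row) for row in TABLE))
```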
A Student Project


In the preceding subsection, a non-factorial subdomain B was produced for each
case that the domain $A(d) = \text{alg. int.}(\mathbb{Q}(\sqrt{d}))$ was Euclidean. But there are only
finitely many of these domains A(d). This raises the question whether one can find
such a subdomain B in more standard generic Euclidean domains. We know that
no such subdomain can exist in the domain of integers $\mathbb{Z}$ due to the paucity of its
subdomains. But are there non-factorial subdomains B of the classical Euclidean
domain of polynomials over a field?

This might be done by duplicating the development of the previous section, replacing
the ring of integers $\mathbb{Z}$ by an arbitrary unique factorization domain D which is
distinct from its field of fractions k. Suppose further that K is a field of degree two
over k generated by a root $\omega$ of an irreducible quadratic polynomial $x^2 - bx - c$ in
D[x] which has distinct roots $\{\omega, \bar\omega\}$. (One must assume that the characteristic of D
is not 2.) Then the student can show that, as before, $K = k \oplus \omega k$ as a vector space
over k and that the mapping $\sigma$ defined by

$$\sigma(u + \omega v) = u + \bar\omega v, \quad \text{for all } u, v \in k,$$

is a field automorphism of K of order two so that the trace and norm mappings T
and N can be defined as before.

Now let A be the set of elements in K whose norm and trace are in D. The
student should be able to prove that if D is Noetherian, A is a ring. In any event,
$A_0 := \{a + \omega b \mid a, b \in D\}$ is a subring.

Now since D is a UFD not equal to its quotient field, there is certainly a prime p
in D. It is easy to see that

$$B_p = \{a + \omega b \mid a, b \in D,\ b \in pD\} = D \oplus \omega pD$$

is a $\sigma$-invariant integral domain. Then the proofs of Lemmas 9.A.4 and 9.A.5 and
Corollary 9.A.1 go through as before (note that the fact that D is a UFD is used here).
The analogue of the Theorem will follow:


If $A_0$ contains an element of norm p, B is not factorial.


Can we do this so that between A and B lies the Euclidean ring F[x]? The answer
is “yes”, if the characteristic of the field F is not 2.

In fact we can pass directly from F[x] to the subdomain B and fill in the roles of
D, k, K, A, and $A_0$ later.

Let p(x) be an irreducible polynomial in F[x] where the characteristic of F is
not 2. Then (using an obvious ring isomorphism) $p(x^2)$ is an irreducible element of
the subdomain $F[x^2]$. Now the student may show that the set

$$B_p := F[x^2] \oplus x\,p(x^2)F[x^2]$$

is a subdomain of F[x].
Now $D = F[x^2]$ is a UFD with quotient field $k = F(x^2)$, the so-called “field
of rational functions” over F. The quotient field K = F(x) of F[x] is of degree two
over k, with the indeterminate x playing the role of $\omega$. Thus

$$F(x) = F(x^2) \oplus xF(x^2)$$

since $\omega = x$ is a zero of the polynomial $y^2 - x^2$, which is irreducible in $F[x^2][y] =
D[y]$. Since the characteristic is not 2, the roots x and −x are distinct. Thus we obtain
our field automorphism $\sigma$ which takes any element $\zeta = r(x^2) + xs(x^2)$ of F(x) to
$r(x^2) - xs(x^2)$ (r and s are quotients of polynomials in $F[x^2]$). (Note that this is just
the substitution of −x for x in F(x).) Now the student can write out the norm and
trace of such a generic $\zeta$. For example

$$N(\zeta) = r^2 - x^2 s^2. \qquad (9.10)$$


At this point the ring $B_p$ given above is not factorial if there exists a polynomial
$\zeta = a(x^2) + xb(x^2)$ in F[x] of norm $p(x^2)$. That is,

$$a(x^2)^2 - x^2 b(x^2)^2 = p(x^2)$$

is irreducible in $F[x^2]$.

Here is an example: Take $a(x^2) = 1 = b(x^2)$, so $\zeta = 1 + x$. Then $N(1 + x) =
1 - x^2$ is irreducible in $F[x^2]$. Then in $B_p = F[x^2] \oplus x(1 - x^2)F[x^2]$, we have the
factorizations

$$(1 - x^2)^3 = (1 + x - x^2 - x^3)(1 - x - x^2 + x^3)$$

into irreducible elements.
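
The polynomial identity in this example can be confirmed by direct expansion. The sketch below assumes integer coefficients (for instance F = $\mathbb{Q}$) and stores a polynomial as a list of coefficients, lowest degree first; `poly_mul` and `in_Bp` are our own helper names. Membership in $B_p$ is tested through the observation that the odd part $x\,s(x^2)$ of an element must have $s$ divisible by $1 - y$, that is, $s(1) = 0$.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def in_Bp(f):
    """f is in F[x^2] + x(1 - x^2)F[x^2] iff its odd-degree coefficients sum to 0."""
    return sum(f[1::2]) == 0

p = [1, 0, -1]            # 1 - x^2
left = [1, 1, -1, -1]     # 1 + x - x^2 - x^3 = (1 - x^2)(1 + x)
right = [1, -1, -1, 1]    # 1 - x - x^2 + x^3 = (1 - x^2)(1 - x)
cube = poly_mul(p, poly_mul(p, p))

print(cube)
print(poly_mul(left, right) == cube)
print(in_Bp(p), in_Bp(left), in_Bp(right), in_Bp([1, 1]))
```

The last value printed illustrates that 1 + x itself, an element of norm $1 - x^2$, does not lie in $B_p$, which is the analogue of Lemma 9.A.5.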


Chapter 10 
Principal Ideal Domains and Their Modules 


Abstract An integral domain in which every ideal is generated by a single element
is called a principal ideal domain or PID. Finitely generated modules over a PID are
completely classified in this chapter. They are uniquely determined by a collection
of ring elements called the elementary divisors. This theory is applied to two of the
most prominent PIDs in mathematics: the ring of integers, Z, and the polynomial
rings F[x], where F is a field. In the case of the integers, the theory yields a complete
classification of finitely generated abelian groups. In the case of the polynomial ring 
one obtains a complete analysis of a linear transformation of a finite-dimensional 
vector space. The rational canonical form, and, by enlarging the field, the Jordan 
form, emerge from these invariants. 


10.1 Introduction 


Throughout this chapter, the integral domain D (and sometimes R) will be a principal 
ideal domain (PID). Thus it is a commutative ring for which a product of two elements 
is zero only if one of the factors is already zero—that is, the set of non-zero elements is 
closed under multiplication. Moreover every ideal is generated by a single element— 
and so has the form Dg for some element g in the ideal. 

Of course, as rings go, principal ideal domains are hardly typical. But two of their
examples, the ring of integers Z, and the ring F[x] of all polynomials in a single
indeterminate x over a field F, are so pervasive throughout mathematics that their
atypical properties deserve special attention.

In the previous chapter we observed that principal ideal domains were in fact 
unique factorization domains. In this chapter, the focus is no longer on the internal 
arithmetic of such rings, but rather on the structure of all finitely-generated submod- 
ules over these domains. As one will soon see, the structure is rather precise. 
10.2 Quick Review of PID’s


Principal Ideal Domains (PID’s) are integral domains D for which each ideal $I$ has
the form $I = aD$ for some element a of D. One may then employ Theorem 8.2.11
to infer from the fact that every ideal is finitely generated, that the Ascending Chain
Condition (ACC) holds for the poset of all ideals.

Recall that a non-zero element a divides element b if and only if

$$aD \supseteq bD.$$

It follows, as we have seen in the arguments that a principal ideal domain (PID)
is a unique factorization domain (UFD), that a is irreducible if and only if aD is a
maximal ideal, and that this happens if and only if a is a prime element (see Lemma
9.2.4).
We also recall that for ideals,
$$xD = yD$$

if and only if x and y are associates. For this equality means x = yr and y = xs for some
r and s in D, so x = xrs, so sr = 1 by the cancellation law. Thus r and s, being
elements of D, are both units.

The least common multiple of two elements $a, b \in D$ is a generator m of the
ideal

$$aD \cap bD = mD,$$

where m is unique up to associates.
The greatest common divisor of two non-zero elements a and b is a generator d of

$$aD + bD = dD,$$

again up to the replacement of d by an associate. Two elements a and b are relatively
prime if and only if

$$aD + bD = D.$$
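
In the PID D = Z these generators can be computed explicitly. The following sketch uses the extended Euclidean algorithm, a standard method that the text does not spell out, to produce d together with x and y satisfying d = ax + by; this exhibits d as an element of aZ + bZ and hence as a generator of that ideal. The function names are our own.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) >= 0 and d == a*x + b*y."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:                      # normalise the generator by the unit -1
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def lcm(a, b):
    """A non-negative generator m of the ideal aZ ∩ bZ."""
    d, _, _ = extended_gcd(a, b)
    return 0 if d == 0 else abs(a * b) // d

d, x, y = extended_gcd(84, 30)
print(d, 84 * x + 30 * y, lcm(84, 30))
print(extended_gcd(9, 20)[0])          # 9Z + 20Z = Z: relatively prime
```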
Our aim in this chapter will be to find the structure of finitely-generated modules
over a PID. Basically, our main theorem will say that if M is a finitely-generated
module over a principal ideal domain D, then

$$M \cong (D/De_1) \oplus (D/De_2) \oplus \cdots \oplus (D/De_k)$$

as right D-modules, where k is a natural number and for $1 \le i \le k - 1$, $e_i$ divides
$e_{i+1}$.
This is a very useful theorem. Its applications are (1) the classification of finite
abelian groups, and (2) the complete analysis of a single linear transformation
$T : V \to V$ of a finite-dimensional vector space into itself. It will turn out that the set
of elements $e_i$ which are not units is uniquely determined up to associates. But all of
that still lies ahead of us.


10.3 Free Modules over a PID 


We say that an R-module M has a basis if and only if there is a subset B of M (called
an R-basis or simply a basis, if R is understood) such that every element m of M
has a unique expression

$$m = \sum_i b_i r_i, \quad b_i \in B,\ r_i \in R,$$

as a finite R-linear combination of elements of B. The uniqueness is the key item
here.

Recall that in Chap. 8, p. 243, we have defined a free (right) module over the
ring R to be a (right) R-module M satisfying any of the following three equivalent
conditions:


• M possesses an R-basis, as defined just above.

• M is generated by an R-linearly independent set B. (In this case, B is indeed a
basis).

• M is a direct sum of submodules $M_i$, each of which is isomorphic to $R_R$ as right
R-modules.


The equivalence of these conditions was the content of Lemma 8.1.8. 

In general, there is no result asserting that two bases of a free R-module have the 
same cardinality. However, the result is true when R is commutative (see Theorem 
8.1.11), and so it is true for an integral domain D. We call the unique cardinality of
a basis for a free D-module F the D-rank of F and denote it $\mathrm{rk}_D(F)$ or just rk(F)
if D is understood.


10.4 A Technical Result 


Theorem 10.4.1 Let n and m be natural numbers and let A be an n × m matrix with
coefficients from a principal ideal domain D. Then there exists an n × n matrix P
invertible in $D^{n \times n}$, and an m × m matrix Q, invertible in $D^{m \times m}$, such that

$$PAQ = \begin{pmatrix} d_1 & 0 & \cdots \\ 0 & d_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}, \quad \text{where } d_i \text{ divides } d_{i+1}.$$

The elements $\{d_1, d_2, \ldots, d_k\}$, where k = min(n, m), are uniquely determined up to
associates by the matrix A.


Proof For each non-zero element d of D let $\rho(d)$ be the number of prime divisors
occurring in a factorization of d into primes. (Recalling that a PID is always a UFD,
the function $\rho$ is seen to be well-defined.) We call $\rho(d)$ the factorization length of
the element d.

Our goal is to convert A into a diagonal matrix with successively dividing diagonal
elements by a series of right and left multiplications by invertible matrices. These
will allow us to perform several so-called elementary row and column operations,
namely:


(I) Add b times the ith row (column) to the jth row (column), $i \neq j$. This is
left (right) multiplication by a suitably sized matrix having b in the (j, i)th
(or (i, j)th) position while having all diagonal elements equal to $1_D$, and all
further entries equal to zero.
(II) Multiply a row (or column) throughout by a unit u of D. This is left (right)
multiplication by a diagonal matrix diag(1, …, 1, u, 1, …, 1) of suitable size.
(III) Permute rows (columns). This is left (right) multiplication by a suitable
permutation matrix.


First observe that there is nothing to do if A is the n × m matrix with all entries
zero. In this case each $d_i = 0_D$, and they are unique.
Thus for $A \neq 0$, we consider the entire collection of matrices

$$S = \{PAQ \mid P, Q \text{ units in } D^{n \times n} \text{ and } D^{m \times m}, \text{ respectively}\}.$$

Among the non-zero entries of the matrices in S, there can be found a matrix element
$b_{ij}$ (in some $B \in S$) with factorization length $\rho(b_{ij})$ minimal. Then by rearranging
the rows and columns of B if necessary, we may assume that this $b_{ij}$ is in the (1, 1)-
position of B, that is, (i, j) = (1, 1).
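We now claim

$$b_{11} \text{ divides } b_{1k} \text{ for } k = 2, \ldots, m. \qquad (10.1)$$

To prove the claim, assume $b_{11}$ does not divide $b_{1k}$. Permuting a few columns if
necessary, we may assume k = 2. Since D is a principal ideal domain, we may write

$$b_{11}D + b_{12}D = Dd.$$
Then as d divides $b_{11}$ and $b_{12}$, we see that
$$b_{12} = ds \quad \text{and} \quad -b_{11} = dt, \quad \text{with } s, t \in D.$$
Then, as $d = b_{11}x + b_{12}y$ for some $x, y \in D$, we see that

$$d = (-dt)x + (ds)y, \quad \text{so} \quad 1 = sy - tx,$$
by the cancellation law. Thus

$$\begin{pmatrix} -t & s \\ y & -x \end{pmatrix} \begin{pmatrix} x & s \\ y & t \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

So, after multiplying B on the right by the invertible matrix

$$Q_1 = \begin{pmatrix} x & s & 0 & \cdots & 0 \\ y & t & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix},$$

the first row of $BQ_1$ is $(d, 0, b_{13}, \ldots, b_{1m})$. Thus, as d now occurs as a non-zero
element of D in a matrix in S, we see that $\rho(b_{11}) \le \rho(d) \le \rho(b_{11})$, so that $b_{11}$ is an
associate of d. Thus $b_{11}$ divides $b_{12}$ as claimed.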

Similarly, we see that $b_{11}$ divides each entry $b_{k1}$ in the first column of B. Thus,
subtracting suitable multiples of the first row from the remaining rows of B, we
obtain a matrix in S with first column $(b_{11}, 0, \ldots, 0)^T$. Then subtracting suitable
multiples of the first column from the remaining columns of this matrix, we arrive
at a matrix

$$B_0 = \begin{pmatrix} b_{11} & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & B_1 & \\ 0 & & & \end{pmatrix},$$

where $B_0 \in S$. We set $B_1 = (c_{ij})$, where $1 \le i \le n - 1$ and $1 \le j \le m - 1$.
We now expand on this: we even claim that

$$b_{11} \text{ divides each entry } c_{ij} \text{ of the matrix } B_1.$$

This is simply because addition of the (i + 1)st row of $B_0$ to its first row
yields a matrix $B'$ in S whose first row is $(b_{11}, c_{i1}, c_{i2}, \ldots, c_{i,m-1})$. But we
have just seen that in any matrix in S containing an element of minimal possible
factorization length in the (1, 1)-position, this element divides every element of
the first row and column. Since $B'$ is such a matrix, $b_{11}$ divides $c_{ij}$.

Now by induction on 1 + m, there exist invertible matrices P, and Q, such that 
P,B,Q, = diag(d2, d3,...,) with do dividing d3 dividing Ld Augmenting 
P, and Q, by adding a row of zeros above, and then a column to the left whose 


' We apologize for the notation here: diag(d, d3, .. . ,) should not be presumed to be a square matrix. 
Of course it is still an (n — 1) x (m — 1)-matrix, with all entries zero except possibly those that 
bear a d; at the (i, 7)-position. If n < m the main diagonal hits the “floor” before it hits the “east 
wall’, and the other way round if n > m. 
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top entry is 1, the remaining entries 0, one obtains matrices P; and Q‘ such that 
P;} BoQ', = diag(d1, dz, .. .) where we have written d; for b11. 

Let us address the uniqueness. Now the element $d_1 = b_{11}$ with minimal prime length among the entries in $B$ was completely determined by $B$. Moreover, the pair $(b_{11}, B)$ determined $B_0$ as well as its submatrix $B_1$. Finally, by induction on $n + m$, the numbers $d_2, \ldots, d_k$ are uniquely determined up to associates by the matrix $B_1$. Thus all of the $d_i$ are uniquely determined by the initial matrix $B$. 
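
The reduction just described can be tried out over the integers. The following is only a sketch under stated assumptions: it works over D = Z, it picks the pivot by least absolute value (a Euclidean stand-in for the "minimal factorization length" used in the text), and the function name `invariant_factors` is ours, not the book's. It returns the non-zero diagonal entries, each dividing the next.

```python
def invariant_factors(matrix):
    """Non-zero diagonal entries d1 | d2 | ... reached from an integer matrix
    by invertible row and column operations over Z (illustrative sketch)."""
    a = [list(row) for row in matrix]
    n = len(a)
    m = len(a[0]) if a else 0
    diag = []
    t = 0  # rows and columns with index < t are already finished
    while t < min(n, m):
        # Choose a non-zero entry of least absolute value in the remaining block.
        pivot = None
        for i in range(t, n):
            for j in range(t, m):
                if a[i][j] and (pivot is None
                                or abs(a[i][j]) < abs(a[pivot[0]][pivot[1]])):
                    pivot = (i, j)
        if pivot is None:
            break  # the remaining block is zero
        i, j = pivot
        a[t], a[i] = a[i], a[t]              # row swap
        for row in a:                         # column swap
            row[t], row[j] = row[j], row[t]
        p = a[t][t]
        # Subtract multiples of row t from the rows below it.
        for i in range(t + 1, n):
            q = a[i][t] // p
            a[i] = [x - q * y for x, y in zip(a[i], a[t])]
        # Subtract multiples of column t from the columns to its right.
        for j in range(t + 1, m):
            q = a[t][j] // p
            for row in a:
                row[j] -= q * row[t]
        # A non-zero remainder is smaller than the pivot: start over with it.
        if any(a[i][t] for i in range(t + 1, n)) or any(a[t][j] for j in range(t + 1, m)):
            continue
        # If some entry of the lower block is not divisible by p, add its row
        # to row t (the step used in the text to show b11 divides every c_ij).
        bad = next((i for i in range(t + 1, n)
                    for j in range(t + 1, m) if a[i][j] % p), None)
        if bad is not None:
            a[t] = [x + y for x, y in zip(a[t], a[bad])]
            continue
        diag.append(abs(p))
        t += 1
    return diag
```

For instance, `invariant_factors([[2, 0], [0, 3]])` gives `[1, 6]`, reflecting the fact that $\mathbb{Z}/2 \oplus \mathbb{Z}/3 \cong \mathbb{Z}/6$.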


10.5 Finitely Generated Modules over a PID 


Theorem 10.5.1 (The Invariant Factor Theorem) Let D be a principal ideal domain and let M be a finitely generated D-module. Then as right D-modules, we have 

\[ M \cong (D/Dd_1) \oplus (D/Dd_2) \oplus \cdots \oplus (D/Dd_n), \]

where $d_i$ divides $d_{i+1}$, $i = 1, \ldots, n-1$. 


Proof Let M be a finitely generated D-module. Then by Theorem 8.1.9, there is a finitely generated free module F and a D-epimorphism 

\[ f : F \to M. \]

Then F has a D-basis $\{x_1, \ldots, x_n\}$, and, as D is Noetherian, its submodule ker f is finitely generated. Thus we have 

\[ F = x_1 D \oplus x_2 D \oplus \cdots \oplus x_n D \tag{10.2} \]
\[ \ker f = y_1 D + y_2 D + \cdots + y_m D. \tag{10.3} \]


We then have m unique expressions 

\[ y_i = x_1 a_{i1} + x_2 a_{i2} + \cdots + x_n a_{in}, \quad i = 1, \ldots, m, \]

and an $m \times n$ matrix $A = (a_{ij})$ with entries in D. By the technical result of the previous subsection, there exist matrices $P = (p_{ij})$ and $Q = (q_{ij})$ invertible in $D^{m \times m}$ and $D^{n \times n}$, respectively, such that 

\[ PAQ = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \end{pmatrix} \quad (\text{where } d_i \mid d_{i+1}), \]

an $m \times n$ matrix whose only non-zero entries are on the principal diagonal and proceed until they can't go any further (that is, they are defined up to subscript $\min(m, n)$). If we set $Q^{-1} = (q^{*}_{ij})$, and 

\[ y_i' = \sum_{j} y_j p_{ij} \quad \text{and} \quad x_i' = \sum_{j} x_j q^{*}_{ij}, \]


then $PAQ$ expresses the $y_i'$ in terms of the $x_i'$, which now comprise a new D-basis for F. Thus we have 

\[ y_1' = x_1' d_1, \quad y_2' = x_2' d_2, \quad \ldots \text{ etc.}, \]

(where, by convention, the expression $d_k$ is taken to be zero for $\min(m, n) < k \le m$). Also, since $P$ and $Q^{-1}$ are invertible, the $y_i'$ also span ker f. So, bearing in mind that some of the $y_i' = x_i' d_i$ may be zero, and extending the definition of the $d_k$ by setting $d_k := 0$ if $m < k \le n$, we may write 

\[ F = x_1' D \oplus x_2' D \oplus \cdots \oplus x_n' D \]
\[ \ker f = x_1' d_1 D \oplus x_2' d_2 D \oplus \cdots \oplus x_n' d_n D. \]

Then 

\[ M = f(F) \cong F/\ker f \cong (D/d_1 D) \oplus (D/d_2 D) \oplus \cdots \oplus (D/d_n D) \]


and the proof of the theorem is complete. 


Remark The essential uniqueness of the elements $\{d_i\}$ appearing in the above theorem will soon emerge in Sect. 10.6 below. These elements are called the invariant factors (or sometimes the elementary divisors) of the finitely generated D-module M. 

At this stage their uniqueness does not immediately follow from the fact that they 
are uniquely determined by the matrix A since this matrix itself depended on the 
choice of a finite spanning set for ker f and it is not clear that all such finite spanning 
sets can be moved to one another by the action of an invertible transformation Q in 
$\hom_D(\ker f, \ker f)$. 

An element m of a D-module M is called a torsion element if and only if there exists a non-zero element $d \in D$ such that $md = 0$. Recall from Exercise (3) in Sect. 8.5.1 that the annihilator $\mathrm{Ann}(m) = \{d \in D \mid md = 0\}$ is an ideal and that we have just said that m is a torsion element if and only if $\mathrm{Ann}(m) \ne 0$. In an integral domain, the intersection of two non-zero ideals is a non-zero ideal; it follows from this that the sum of two torsion elements of a D-module is a torsion element. In fact the set of all torsion elements of M forms a submodule of M called the torsion submodule and denoted torM. 


Corollary 10.5.2 A finitely generated module M over a principal ideal domain D is a direct sum of its torsion submodule torM and a free module. 

Proof By the Invariant Factor Theorem (Theorem 10.5.1), 

\[ M \cong (D/d_1 D) \oplus \cdots \oplus (D/d_n D) \]

where $d_i$ divides $d_{i+1}$. Choose the index t so that $i \le t$ implies $d_i \ne 0$, and, for $i > t$, we have $d_i = 0$. Then 

\[ \mathrm{tor}M = (D/d_1 D) \oplus \cdots \oplus (D/d_t D) \]

and so 

\[ M = \mathrm{tor}M \oplus D_D \oplus D_D \oplus \cdots \oplus D_D, \]

with $n - t$ copies of $D_D$. 


If M = torM, we say that M is a torsion module. 

Let M be a module over a principal ideal domain, D. Let p be a prime element 
in D. We define the p-primary part of M as the set of elements of M annihilated by 
some power of p, that is 


\[ M_p = \{ m \in M \mid m p^{k} = 0 \text{ for some natural number } k = k(m) \}. \]


It is easy to check that $M_p$ is a submodule of M. We now have 


Theorem 10.5.3 (Primary Decomposition Theorem) If M is a finitely generated torsion D-module, where D is a principal ideal domain, then there exists a finite number of primes $p_1, p_2, \ldots, p_N$ such that 

\[ M \cong M_{p_1} \oplus \cdots \oplus M_{p_N}. \]


Proof Since M is finitely generated, we have $M = x_1 D + \cdots + x_m D$ for some finite generating set $\{x_i\}$. Since M is a torsion module, each of the $x_i$ is a torsion element, and so all of the ideals $A_i := \mathrm{Ann}(x_i)$ are non-zero; and since D is a principal ideal domain, 

\[ A_i = \mathrm{Ann}(x_i) = d_i D, \quad \text{where } d_i \ne 0, \; i = 1, \ldots, m. \]

Also, as D is a unique factorization domain as well, each $d_i$ is expressible as a product of primes: 

\[ d_i = p_{i1}^{a_{i1}} \, p_{i2}^{a_{i2}} \cdots p_{if(i)}^{a_{if(i)}} \]

for some function f of the index i. We let $\{p_1, \ldots, p_N\}$ be a complete re-listing of all the primes $p_{ij}$ which appear in these factorizations. If $f(i) = 1$, then $d_i$ is a prime power and so $x_i D \subseteq M_{p_j}$ for some j. If, on the other hand, $f(i) \ne 1$, then, forming the elements $v_{ij} = d_i / p_{ij}^{a_{ij}}$, we see that the greatest common divisor of the elements $\{v_{i1}, \ldots, v_{if(i)}\}$ is 1, and this means 

\[ v_{i1} D + v_{i2} D + \cdots + v_{if(i)} D = D. \]

Thus there exist elements $b_1, \ldots, b_{f(i)}$ such that 

\[ v_{i1} b_1 + v_{i2} b_2 + \cdots + v_{if(i)} b_{f(i)} = 1. \]

Note that 

\[ x_i v_{ij} b_j \in M_{p_{ij}} \quad \text{since} \quad (x_i v_{ij} b_j)\, p_{ij}^{a_{ij}} = x_i d_i b_j = 0, \]

so 

\[ x_i = x_i \cdot 1 = \sum_{j=1}^{f(i)} x_i v_{ij} b_j \in M_{p_{i1}} + \cdots + M_{p_{if(i)}}. \]

Thus in all cases 

\[ x_i D \subseteq M_{p_1} + \cdots + M_{p_N} \]

for $i = 1, \ldots, m$, and the sum on the right side is M. Now it remains only to show that this sum is direct and we do this by the criterion for direct sums of D-modules (see Theorem 8.1.6). Consider an element 

\[ a \in (M_{p_1} + \cdots + M_{p_k}) \cap M_{p_{k+1}}. \]

Then the ideal Ann(a) contains, on the one hand, an element m which is a product of primes in $\{p_1, \ldots, p_k\}$, and on the other hand, a power $p_{k+1}^{e}$ of the prime $p_{k+1}$. Since $\gcd(m, p_{k+1}^{e}) = 1$, we have 

\[ \mathrm{Ann}(a) \supseteq mD + p_{k+1}^{e} D = D, \]

so $a \cdot 1 = a = 0$. Thus we have shown that 

\[ (M_{p_1} + \cdots + M_{p_k}) \cap M_{p_{k+1}} = (0). \]

This proves that 

\[ M = M_{p_1} \oplus M_{p_2} \oplus \cdots \oplus M_{p_N}, \]

and the proof is complete. 
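
The splitting step of this proof can be imitated in the cyclic Z-module Z/nZ. The sketch below is an illustration under assumptions of our own: the helper names are hypothetical, and instead of finding $b_j$ with $\sum_j v_j b_j = 1$ exactly, it takes $b_j$ to be an inverse of $v_j$ modulo $p_j^{a_j}$, so that the sum is only congruent to 1 modulo n. That is enough here because n annihilates the module. It needs Python 3.8 or later for `pow(v, -1, q)`.

```python
def prime_power_factors(n):
    """Prime-power factors of n, for example 360 -> [8, 9, 5]."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            factors.append(q)
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def primary_components(x, n):
    """Write x in Z/nZ as a sum of elements, one from each p-primary part."""
    parts = {}
    for q in prime_power_factors(n):
        v = n // q                  # v_j = d / p_j^{a_j}
        b = pow(v, -1, q)           # b_j with v_j * b_j = 1 modulo p_j^{a_j}
        parts[q] = (x * v * b) % n  # this component is annihilated by q
    return parts
```

With n = 360 and x = 1, the three components are annihilated by 8, 9 and 5 respectively, and they add up to 1 modulo 360.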
10.6 The Uniqueness of the Invariant Factors 


We have seen that a finitely generated module M over a principal ideal domain has 
a direct decomposition as a torsion module and a finitely generated free module 
(corresponding to the number of invariant factors which are zero). This free module 
is isomorphic to M/torM and so its rank is the unique rank of M/torM (Theorem 
8.1.11), and consequently is an invariant of M. The torsion submodule, in turn, 
has a unique decomposition into primary parts whose invariant factors determine 
those of tor(M). Thus, in order to show that the so-called invariant factors earn their 
name as genuine invariants of M, we need only show this for a module M which is 
primary—i.e. M = M, for some prime p in D. 
In this case, applying the invariant factor theorem, we have 

\[ M \cong D/p^{s_1}D \oplus \cdots \oplus D/p^{s_t}D \tag{10.4} \]

where $s_1 \le s_2 \le \cdots \le s_t$, and we must show that the sequence $S = (s_1, \ldots, s_t)$ is uniquely determined by M. We shall accomplish this by producing another sequence $\Omega = \{\omega_i\}$, whose terms are manifestly invariants determined by the isomorphism type of M, and then show that S is determined by $\Omega$. 

For any p-primary D-module A, set 

\[ \Omega_1(A) := \{ a \in A \mid ap = 0 \}. \]

Then $\Omega_1(A)$ is a D-submodule of A which can be regarded as a vector space over the field $F := D/pD$. We denote the dimension of this vector space by the symbol $\mathrm{rk}(\Omega_1(A))$. Clearly, $\Omega_1(A)$ and its rank are uniquely determined by the isomorphism type of A. 

Now let us apply these notions to the module M with invariants $S = (s_1, \ldots, s_t)$ as in Eq. (10.4). First we set $\omega_1 := \mathrm{rk}(\Omega_1(M))$, and $K_1 := \Omega_1(M)$. In general we set 

\[ K_{i+1}/K_i = \Omega_1(M/K_i), \quad \text{and} \quad \omega_{i+1} := \mathrm{rk}(K_{i+1}/K_i). \]

The $K_i$ form an ascending chain of submodules of M which ascends properly until it stabilizes at $i = s_t$, the maximal element of S (note that $\mathrm{Ann}(M) = Dp^{s_t}$, so that $p^{s_t}$ can be thought of as the “exponent” of M). Moreover, each of the D-modules $K_i$ is completely determined by the isomorphism type of $M/K_{i-1}$, which is determined by $K_{i-1}$, and ultimately by M alone. Thus we obtain a sequence of numbers 

\[ \Omega = (\omega_1, \omega_2, \ldots, \omega_{s_t}), \]

which are completely determined by the isomorphism type of M. One can easily see 

(*) $\omega_j$ is the number of elements in the sequence S which are at least as large as j. 

Now the sequence S can be recovered from $\Omega$ as follows: 

(**) For $j = 0, 1, \ldots, t-1$, $s_{t-j}$ is the number of elements $\omega_r$ of $\Omega$ whose value is at least $j + 1$. 


Example 48 Suppose p is a prime in the principal ideal domain D and A is a primary D-module with invariant factors: 

\[ p, \; p, \; p^2, \; p^3, \; p^3, \; p^4, \; p^7, \; p^7, \; p^{10}, \]

so $S = (1, 1, 2, 3, 3, 4, 7, 7, 10)$. Then, setting $F = D/pD$, $K_1$ is an F-module of rank 9, $K_2/K_1$ is an F-module of rank 7, and in general $\Omega = (9, 7, 6, 4, 3, 3, 3, 1, 1, 1)$. Then $S = (1, 1, 2, 3, 3, 4, 7, 7, 10)$ is recovered by the recipe in (**). 
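
The passage between S and $\Omega$ in (*) and (**) is purely combinatorial, and a few lines of code make it concrete. This is a minimal sketch; the function names are ours and sequences are plain Python lists.

```python
def omega_from_s(s):
    """(*): omega_j is the number of entries of s that are at least j."""
    return [sum(1 for e in s if e >= j) for j in range(1, max(s) + 1)] if s else []


def s_from_omega(omega):
    """(**): s_{t-j} is the number of omega_r that are at least j + 1."""
    t = omega[0] if omega else 0          # omega_1 counts all entries of s
    largest_first = [sum(1 for w in omega if w >= j + 1) for j in range(t)]
    return largest_first[::-1]            # back to ascending order s_1 <= ... <= s_t
```

Applied to the sequence S of Example 48, `omega_from_s` returns the $\Omega$ listed there, and `s_from_omega` takes that $\Omega$ back to S.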


We conclude: 


Theorem 10.6.1. The non-unit invariant factors of a finitely-generated module over 
a principal ideal domain are completely determined by the isomorphism type of the 
module—i.e., they are really invariants. 


10.7 Applications of the Theorem on PID Modules 


10.7.1 Classification of Finite Abelian Groups 


As remarked before, an abelian group is simply a Z-module. Since Z is a PID, any finitely generated Z-module A has the form 

\[ A \cong Z_{d_1} \times Z_{d_2} \times \cdots \times Z_{d_m}, \]

that is, A is uniquely expressible as the direct product of cyclic groups, the order of each dividing the next. The numbers $(d_1, \ldots, d_m)$ are thus invariants of A and the decomposition just presented is called the Jordan decomposition of A. The number of distinct isomorphism types of abelian groups of order n is thus the number of ordered sets $(d_1, \ldots, d_m)$ such that each $d_i > 1$, $d_i$ divides $d_{i+1}$, and $d_1 d_2 \cdots d_m = n$. 


Example 49 The number of abelian groups of order 36 is 4. The four possible ordered 
sets are: (36), (2, 18), (3, 12), (6, 6). 
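
The count in Example 49 can be checked by brute force. The following sketch (the function name is our own) lists every chain $(d_1, \ldots, d_m)$ with each $d_i > 1$, $d_i$ dividing $d_{i+1}$, and product n.

```python
def divisor_chains(n, prev=1):
    """Tuples (d1, ..., dm) with every d_i > 1, prev | d1, d_i | d_{i+1}
    and d1 * ... * dm == n."""
    if n == 1:
        return [()]
    result = []
    for d in range(2, n + 1):
        if n % d == 0 and d % prev == 0:
            # d is the next term; the remaining terms multiply to n // d
            result += [(d,) + rest for rest in divisor_chains(n // d, d)]
    return result
```

For n = 36 this produces exactly the four chains named in the example.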


This number is usually much easier to compute if we first perform the primary decomposition. If we assume A is a finite abelian group of order n, and if p is a prime dividing n, then $A_p = \{a \in A \mid ap^{k} = 0 \text{ for some } k \ge 0\}$ is just the set of elements of A of p-power order. Thus $A_p$ is the unique p-Sylow subgroup of A. Then the primary decomposition 

\[ A = A_{p_1} \oplus \cdots \oplus A_{p_t} \]

is simply the additive version of writing A as a direct product of its Sylow subgroups. Since the $A_{p_i}$ are unique, the isomorphism type of A is uniquely determined by the isomorphism types of the $A_{p_i}$. Thus if $A_{p_i}$ is an abelian group of order $p^{s}$, we can ask for its Jordan decomposition: 

\[ Z_{p^{s_1}} \times Z_{p^{s_2}} \times \cdots \times Z_{p^{s_m}} \]

where $p^{s_i}$ divides $p^{s_{i+1}}$ and $p^{s_1} p^{s_2} \cdots p^{s_m} = p^{s}$. But we can write this condition as: $s_i \le s_{i+1}$ and $s_1 + s_2 + \cdots + s_m = s$. Thus each such sequence renders s as a sum of natural numbers without regard to the order of the summands, called an unordered partition of the integer s. The number of these is usually denoted p(s), and p(s) is called the partition function. For example p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7, etc. Here, we see that the number of abelian groups of order $p^{s}$ is just p(s), and that: 


Corollary 10.7.1. The number of isomorphism classes of abelian groups of order $n = p_1^{t_1} \cdots p_m^{t_m}$ is 

\[ p(t_1)\, p(t_2) \cdots p(t_m). \]


Again, we can compute the number of abelian groups of order $36 = 2^2 3^2$ as $p(2)p(2) = 2^2 = 4$. 
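
Corollary 10.7.1 translates directly into a short computation. The sketch below uses a standard dynamic-programming count of partitions and trial division for the factorization; both function names are ours.

```python
def partition_count(s):
    """Number of unordered partitions of s (dynamic programming over part sizes)."""
    ways = [1] + [0] * s
    for part in range(1, s + 1):
        for total in range(part, s + 1):
            ways[total] += ways[total - part]
    return ways[s]


def count_abelian_groups(n):
    """Product of p(t_i) over the factorization n = p_1^{t_1} ... p_m^{t_m}."""
    count, p = 1, 2
    while p * p <= n:
        t = 0
        while n % p == 0:
            n //= p
            t += 1
        count *= partition_count(t)   # partition_count(0) == 1 when p does not divide n
        p += 1
    if n > 1:
        count *= partition_count(1)   # one remaining prime with exponent 1
    return count
```

This agrees with the values p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7 quoted above and with the count of four abelian groups of order 36.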


10.7.2 The Rational Canonical Form of a Linear 
Transformation 


Let F be any field and let V be an n-dimensional vector space over F so $V = F \oplus \cdots \oplus F$ (n copies) as an F-module. Suppose $T : V \to V$ is a linear transformation on V, which we regard as a right operator on V. Now in $\hom_F(V, V)$, the ring of linear transformations of V into V, multiplication is composition of transformations. Thus for any such transformation T, the transformations $T$, $T^2 := T \circ T$ and in fact any polynomial in T are well-defined linear transformations of V. 

As is customary, F[x] will denote the ordinary ring of polynomials in an indeterminate x. We wish to convert V into an F[x]-module by defining 

\[ v \cdot p(x) := v\,p(T) \quad \text{for any } v \in V \text{ and } p(x) \in F[x]. \]

This module is finitely generated since an F-basis of V will certainly serve as a set of generators of V as an F[x]-module. 
It now follows from our main theorem (Theorem 10.5.1) that there are polynomials $p_1(x), p_2(x), \ldots, p_m(x)$ with $p_i(x)$ dividing $p_{i+1}(x)$ in F[x] such that 

\[ V_{F[x]} \cong F[x]/(p_1(x)) \oplus \cdots \oplus F[x]/(p_m(x)), \]



where as usual $(p_i(x))$ denotes the principal ideal $p_i(x)F[x]$. Note that none of the $p_i(x)$ is zero since otherwise $V_{F[x]}$ would have a (free) direct summand $F[x]_{F[x]}$ which is infinite dimensional over F. Also, as usual, those $p_i(x)$ which are units (that is, of degree zero) contribute nothing to the direct decomposition of V. In fact we may assume that the polynomials $p_i(x)$ are monic polynomials of degree at least one. Since they are now uniquely determined polynomials, they have a special name: the invariant factors of the transformation T. 
Now since each $p_i(x)$ divides $p_m(x)$, we must have that $V p_m(x) = 0$ and so 

\[ p_m(x) F[x] = \mathrm{Ann}_{F[x]}(V). \]

For this reason, $p_m(x)$, being a polynomial p(x) of smallest degree such that $p(T) : V \to 0$, is called the minimal polynomial of T and denoted $m_T(x)$. Such a polynomial is determined up to associates, but bearing in mind that we are taking $p_m(x)$ to be monic, the word “the” preceding “minimal polynomial” is justified. 

The product $p_1(x) p_2(x) \cdots p_m(x)$ is called the characteristic polynomial of T (denoted $\chi_T(x)$) and we shall soon see that the product of the constant terms of these polynomials is, up to sign, the determinant $\det T$, familiar from linear algebra courses. 

How do we compute the polynomials $p_i(x)$? To begin with, we must imagine the transformation $T : V \to V$ given to us in such a way that it can be explicitly determined, that is, by a matrix A describing the action of T on V with respect to a basis $B = (v_1, \ldots, v_n)$. Specifically, as T is regarded as a right operator,² 

\[ [T]_B = A = (a_{ij}) \quad \text{so} \quad T(v_i) = \sum_{j=1}^{n} v_j a_{ij}. \]

Then in forming the F[x]-module $V_{F[x]}$, we see that application of the transformation T to V corresponds to right multiplication of every element of V by the indeterminate x. Thus, for any fixed row index i: 

\[ v_i x = \sum_{j=1}^{n} v_j a_{ij} \]

so 

\[ 0 = v_1 a_{i1} + \cdots + v_{i-1} a_{i,i-1} + v_i (a_{ii} - x) + v_{i+1} a_{i,i+1} + \cdots + v_n a_{in}. \]
Now as in the proof of the Invariant Factor Theorem, we form the free module 

\[ \mathrm{Fr} = x_1 F[x] \oplus \cdots \oplus x_n F[x] \]

and the epimorphism $f : \mathrm{Fr} \to V$ defined by $f(x_i) = v_i$. Then the elements 

\[ r_i := x_1 a_{i1} + \cdots + x_{i-1} a_{i,i-1} + x_i (a_{ii} - x) + x_{i+1} a_{i,i+1} + \cdots + x_n a_{in}, \quad i = 1, \ldots, n, \]

all lie in ker f. 

² Note that the rows of A record the fate of each basis vector under T. Similarly, if T had been a left operator rather than a right one, the columns of A would have been recording the fate of each basis vector. The difference is needed only for the purpose of making matrix multiplication represent composition of transformations. It has no real effect upon the invariant factors. 
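Now for any subscript i, x times the element $x_i$ of Fr is 

\[ x_i x = -r_i + \sum_{j=1}^{n} x_j a_{ij}, \quad i = 1, \ldots, n. \]

Thus for any polynomial $p(x) \in F[x]$, we can write 

\[ x_i p(x) = \sum_{k=1}^{n} r_k h_{ik}(x) + \sum_{j=1}^{n} x_j b_j \]

for some polynomials $h_{ik}(x) \in F[x]$ and coefficients $b_j \in F$. 
Now if 

\[ w = x_1 g_1(x) + x_2 g_2(x) + \cdots + x_n g_n(x) \]

were an arbitrary element of ker f, then it would have the form given in the following: 

\[ w = \sum_{i=1}^{n} x_i g_i(x) \tag{10.5} \]
\[ \phantom{w} = \sum_{i=1}^{n} r_i h_i(x) + \sum_{i=1}^{n} x_i c_i \tag{10.6} \]

for appropriate $h_i(x) \in F[x]$ and $c_i \in F$. Then since $f(w) = 0 = f(r_i)$ and $f(x_i) = v_i$ we see that 

\[ f(w) = 0 = f\Big(\sum_{i=1}^{n} r_i h_i(x)\Big) + \sum_{i=1}^{n} v_i c_i \tag{10.7} \]
\[ \phantom{f(w)} = 0 + \sum_{i=1}^{n} v_i c_i. \tag{10.8} \]

So each $c_i = 0$. This means 

\[ w \in r_1 F[x] + \cdots + r_n F[x]. \]

Thus the elements $r_1, \ldots, r_n$ are a set of generators for ker f. 
Thus we have: 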
Thus we have: 


Corollary 10.7.2 To find the invariant factors of the linear transformation represented by the square matrix $A = (a_{ij})$, it suffices to diagonalize the following matrix of $F[x]^{(n \times n)}$, 

\[
A - Ix = \begin{pmatrix}
a_{11} - x & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} - x & a_{23} & \cdots & a_{2n} \\
a_{31} & a_{32} & a_{33} - x & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} - x
\end{pmatrix}
\]

by the elementary row and column operations to obtain the matrix 

\[ \mathrm{diag}(p_1(x), p_2(x), \ldots, p_n(x)) \]

with each $p_i(x)$ monic, and with $p_i(x)$ dividing $p_{i+1}(x)$. Then the $p_i(x)$ which are not units are the invariant factors of A. 


Now, re-listing the non-unit invariant factors of T as $p_1(x), \ldots, p_r(x)$, $r \le n$, the invariant factor theorem says that T acts on V exactly as right multiplication by x acts on 

\[ V' := F[x]/(p_1(x)) \oplus \cdots \oplus F[x]/(p_r(x)). \]

The space $V'$ has a very nice basis. Each $p_i(x)$ is a monic polynomial of degree $d_i > 0$, say 

\[ p_i(x) = b_{0i} + b_{1i} x + b_{2i} x^2 + \cdots + b_{d_i - 1, i}\, x^{d_i - 1} + x^{d_i}. \]

Then the summand $F[x]/(p_i(x))$ of $V'$ is non-trivial, and has the F-basis 

\[ 1 + (p_i(x)), \; x + (p_i(x)), \; x^2 + (p_i(x)), \; \ldots, \; x^{d_i - 1} + (p_i(x)), \]

and right multiplication by x effects the action: 

\[ x^{j} + (p_i(x)) \mapsto x^{j+1} + (p_i(x)), \quad 0 \le j < d_i - 1 \tag{10.9} \]
\[ x^{d_i - 1} + (p_i(x)) \mapsto -b_{0i}(1 + (p_i(x))) - b_{1i}(x + (p_i(x))) - \cdots - b_{d_i - 1, i}(x^{d_i - 1} + (p_i(x))). \tag{10.10} \]

Thus, with respect to this basis, right multiplication of $F[x]/(p_i(x))$ by x is represented by the matrix 

\[
C_{p_i(x)} = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-b_{0i} & -b_{1i} & -b_{2i} & \cdots & -b_{d_i - 1, i}
\end{pmatrix} \tag{10.11}
\]

called the companion matrix of $p_i(x)$. If $p_i(x)$ is a unit in F[x] (that is, it is just a nonzero scalar) then our convention is to regard the companion matrix as the “empty” ($0 \times 0$) matrix. That convention is incorporated in the following 

Corollary 10.7.3 If $p_1(x), \ldots, p_n(x)$ are the invariant factors of the square matrix A, then A is conjugate to the matrix 

\[
P^{-1}AP = \begin{pmatrix}
C_{p_1(x)} & 0 & \cdots & 0 \\
0 & C_{p_2(x)} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & C_{p_n(x)}
\end{pmatrix}.
\]

This is called the rational canonical form of A. 


Remark Note that each companion matrix $C_{p_i(x)}$ is square with $d_i := \deg(p_i(x))$ rows. Using decomposition along the first column, one calculates that $C_{p_i(x)}$ in Eq. (10.11) has determinant $(-1)^{d_i - 1}(-b_{0i}) = (-1)^{d_i} b_{0i}$. Thus the determinant of A is completely determined by the invariant factors. The determinant is just the product 

\[ \det A = \prod_{i=1}^{r} (-1)^{d_i} b_{0i} = (-1)^{n} \prod_{i=1}^{r} b_{0i}, \]

which is the product of the constant terms of the invariant factors times the sign $(-1)^{n}$, where $n = \sum d_i$ is the dimension of the original vector space V. 

It may be of interest to know that the calculation of the determinant of matrix A by ordinary means (that is, iterated column expansions) uses on the order of $n!$ steps. Calculating the determinant by finding the invariant factors (diagonalizing $A - xI$) uses on the order of $n^3$ steps. The “abstract” way is ultimately faster. 
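
The determinant formula for a companion matrix is easy to test numerically. The sketch below is illustrative only: it builds the matrix of Eq. (10.11) from the coefficient list $[b_0, \ldots, b_{d-1}]$ and computes the determinant by ordinary Gaussian elimination over the rationals (itself an order $n^3$ method, used here in place of diagonalizing $A - xI$). The function names are our own.

```python
from fractions import Fraction


def companion(coeffs):
    """Companion matrix, in the row convention of Eq. (10.11), of the monic
    polynomial x^d + b_{d-1} x^{d-1} + ... + b_0, with coeffs = [b_0, ..., b_{d-1}]."""
    d = len(coeffs)
    rows = [[Fraction(int(j == i + 1)) for j in range(d)] for i in range(d - 1)]
    rows.append([Fraction(-b) for b in coeffs])
    return rows


def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [row[:] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]   # a row swap flips the sign
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return sign * result
```

For $p(x) = x^3 - 2x^2 + 3x + 5$ the constant term is 5 and the degree is 3, so the remark predicts determinant $(-1)^3 \cdot 5 = -5$, which is what `det(companion([5, 3, -2]))` returns.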


10.7.3 The Jordan Form 


This section involves a special form that is available when the ground field of the vector space V contains all the roots of the minimal polynomial of the transformation T. 

Suppose now F is a field containing all roots of the minimal polynomial of a linear transformation $T : V \to V$. Then the minimal polynomial, as well as all of the invariant factors, completely factors into prime powers $(x - \lambda_i)^{a_i}$. Applying the primary decomposition to the right F[x]-module V, we see that a “primary part” $V_{(x - \lambda)}$ of V is a direct sum of modules, each isomorphic to F[x] modulo a principal ideal I generated by a power of $x - \lambda$. This means that any matrix representing the linear transformation T is equivalent to a matrix which is a diagonal matrix of submatrices which are companion matrices of powers of linear factors. So one is reduced to considering F[x]-modules of the form $F[x]/I = F[x]/((x - \lambda)^{m})$, for some root $\lambda$. Now $F[x]/I$ has this F-basis: 

\[ 1 + I, \; (x - \lambda) + I, \; (x - \lambda)^2 + I, \; \ldots, \; (x - \lambda)^{m-1} + I. \]

Now right multiplying $F[x]/I$ by $x = \lambda + (x - \lambda)$ takes each basis element to $\lambda$ times itself plus its successor basis element, except for the last basis element, which is merely multiplied by $\lambda$. That basis element, $(x - \lambda)^{m-1} + I$, and its scalar multiples comprise a 1-space of all “eigenvectors” of $F[x]/I$, that is, vectors v such that $vT = \lambda v$. The resulting matrix, with respect to this basis, has the form: 

\[
\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 & 0 \\
0 & \lambda & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & 0 & \cdots & 0 & \lambda
\end{pmatrix}
\]

It is called a Jordan block and is denoted $J_m(\lambda)$. 

The matrix which results from assembling blocks according to the primary de- 
composition, and rewriting the resulting companion matrices in the above form, is 
called the Jordan form of the transformation T. 


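Example 50 Suppose the invariant factors of a matrix A were given as follows: 

\[ x, \quad x(x - 1), \quad x(x - 1)^3 (x - 2)^2. \]

Here, F is any field in which the numbers 0, 1 and 2 are distinct. What is the Jordan form? 
First we separate the invariant factors into their elementary divisors: 

prime $x$: $\{x, x, x\}$. 
prime $x - 2$: $\{(x - 2)^2\}$. 
prime $x - 1$: $\{x - 1, (x - 1)^3\}$. 

Next we write out each Jordan block as above: the result being 

\[
\begin{pmatrix}
0 & & & & & & & & \\
 & 0 & & & & & & & \\
 & & 0 & & & & & & \\
 & & & 2 & 1 & & & & \\
 & & & & 2 & & & & \\
 & & & & & 1 & & & \\
 & & & & & & 1 & 1 & \\
 & & & & & & & 1 & 1 \\
 & & & & & & & & 1
\end{pmatrix}
\]

(All unmarked entries in the above matrix are zero). 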
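
The bookkeeping in Example 50 can be mechanized. In the sketch below, which rests on assumptions of our own, each invariant factor is encoded as a dictionary sending a root to its exponent, so the three factors are read as $x$, $x(x-1)$ and $x(x-1)^3(x-2)^2$. The blocks are ordered by root, which may differ from the order chosen in the text; the order of blocks does not affect the conjugacy class.

```python
def elementary_divisors(invariant_factors):
    """Group the prime powers (x - root)^e of all invariant factors by root."""
    by_root = {}
    for factor in invariant_factors:
        for root, e in factor.items():
            by_root.setdefault(root, []).append(e)
    return by_root


def jordan_block(root, size):
    """Block with root on the diagonal and 1 immediately to its right."""
    return [[root if j == i else 1 if j == i + 1 else 0 for j in range(size)]
            for i in range(size)]


def jordan_form(invariant_factors):
    """Block-diagonal matrix assembled from one Jordan block per elementary divisor."""
    divisors = elementary_divisors(invariant_factors)
    blocks = [jordan_block(root, e) for root in sorted(divisors) for e in divisors[root]]
    n = sum(len(b) for b in blocks)
    matrix, offset = [[0] * n for _ in range(n)], 0
    for b in blocks:
        for i, row in enumerate(b):
            matrix[offset + i][offset:offset + len(b)] = row
        offset += len(b)
    return matrix
```

Calling `jordan_form([{0: 1}, {0: 1, 1: 1}, {0: 1, 1: 3, 2: 2}])` yields a 9 × 9 matrix with diagonal 0, 0, 0, 1, 1, 1, 1, 2, 2.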
10.7.4 Information Carried by the Invariant Factors 


What are some things one would like to know about a linear transformation $T : V \to V$, and how can the invariant factors provide this information? 


1. We have already seen that the last invariant factor $p_r(x)$ is the minimal polynomial of T, that is, the smallest degree monic polynomial p(x) such that p(T) is the zero transformation $V \to 0_V$. (We recall that the ideal generated by p(x) is $\mathrm{Ann}(V_{F[x]})$ when V is converted to an F[x]-module via T.) 

2. Sometimes it is of interest to know the characteristic polynomial of T, which is defined to be the product $p_1(x) \cdots p_r(x)$ of the invariant factors of T. One must bear in mind that by definition the invariant factors of T are monic polynomials of positive degree. The characteristic polynomial must have degree n, the F-dimension of V. This is because the latter is the sum of the dimensions of the $F[x]/(p_i(x))$, that is, the sum of the numbers $\deg p_i(x) = d_i$. 

3. Since V has finite dimension, T is onto if and only if it is one-to-one.³ The dimension of ker T is called the nullity of T, and is given by the number of invariant factors $p_i(x)$ which are divisible by x. 

4. The rank of T is the F-dimension of T(V). It is just n minus the nullity of T (see the footnote just above). 

5. V is said to be a cyclic F[x]-module if and only if there exists a module element v such that the set vF[x] spans V, i.e. v generates V as an F[x]-module. This is true if and only if the only non-unit invariant factor is $p_r(x)$, the minimal polynomial (equivalently: the minimal polynomial is equal to the characteristic polynomial). 

6. Recall that an R-module is said to be irreducible if and only if there are no proper non-zero submodules. (We met these in the Jordan-Hölder theorem for R-modules.) Here, we say that T acts irreducibly on V if and only if the associated F[x]-module is irreducible. This means there exists no proper non-zero sub-vector space W of V which is T-invariant in the sense that $T(W) \subseteq W$. One can then easily see that T acts irreducibly on V if and only if the characteristic polynomial is irreducible in F[x].⁴ 

7. An eigenvector of T associated with the root $\lambda$, or $\lambda$-eigenvector, is a vector v of V such that $vT = \lambda v$. A scalar is called an eigen-root of T if and only if there exists a non-zero eigenvector associated with this scalar. Now one can see that the eigenvectors associated with $\lambda$ for right multiplication of $V_{F[x]}$ by x are precisely the module elements killed by the ideal $(x - \lambda)F[x]$. Clearly they form a T-invariant subspace of V whose dimension is the number of invariant factors of T which are divisible by $(x - \lambda)$. 


³ This is a well-known consequence of the so-called “nullity-rank” theorem for transformations but follows easily from the fundamental theorems of homomorphisms for R-modules applied when R = F is a field. Specifically, there is an isomorphism between the poset of subspaces of T(V) and the poset of subspaces of V which contain ker T. Since dimension is the length of an unrefinable chain in such a poset one obtains $\mathrm{codim}_V(\ker T) = \dim(T(V))$. 

⁴ This condition forces the minimal polynomial to equal the characteristic polynomial (so that the module is cyclic) and both are irreducible polynomials. 

8. A matrix is said to be nilpotent if and only if some power of it is the zero matrix. Similarly a linear transformation $T : V \to V$ is called a nilpotent transformation if and only if some power (number of iterations) of T annihilates V. Quite obviously the following are equivalent: 

• T is a nilpotent transformation. 
• T can be represented by a nilpotent matrix with respect to some basis of V. 
• The characteristic polynomial of T is a power of x. 
• The minimal polynomial of T is a power of x. 
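
Several of the items above can be read off mechanically once the invariant factors are known. The following sketch makes simplifying assumptions of our own: every invariant factor splits into linear factors over F and is encoded as a dictionary sending a root to its (positive) exponent, and the factors are listed in order, so that the last one is the minimal polynomial. The function name `summary` is hypothetical.

```python
def summary(invariant_factors):
    """Basic data of T read off from its non-unit invariant factors, each given
    as a dict {root: exponent} standing for the product of (x - root)^exponent."""
    degrees = [sum(f.values()) for f in invariant_factors]
    n = sum(degrees)                                    # degree of the characteristic polynomial
    nullity = sum(1 for f in invariant_factors if f.get(0, 0) > 0)  # factors divisible by x
    return {
        "dimension": n,
        "minimal_polynomial_degree": degrees[-1],
        "nullity": nullity,
        "rank": n - nullity,
        "cyclic": len(invariant_factors) == 1,          # only one non-unit invariant factor
        "nilpotent": set(invariant_factors[-1]) == {0}, # minimal polynomial is a power of x
    }
```

For invariant factors $x$, $x(x-1)$, $x(x-1)^3(x-2)^2$ this reports dimension 9, nullity 3 and rank 6, and that V is neither cyclic nor nilpotent.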


10.8 Exercises 


10.8.1 Miscellaneous Exercises 


Bezout Domains 


1. An integral domain is said to be a Bezout domain if and only if all of its finitely- 
generated ideals are principal ideals. Give a proof that an integral domain is a 
Bezout domain if and only if every ideal generated by two elements is a principal 
ideal. 

2. Of course the class of Bezout domains includes all principal ideal domains, so it would be interesting to have an example of a Bezout domain which was not a principal ideal domain. This and the subsequent four exercises are designed to produce examples of Bezout domains which are not principal ideal domains. Our candidate will be the integral domain D described in the next paragraph. 
Let E be a PID and let F be the field of fractions of E. Let D be the set of all polynomials of F[x] whose constant coefficient belongs to the domain E. The first elementary result is to verify that D is a subdomain of F[x]. [The warmup result is elementary. But the reader might in general like to prove it using Exercise (1) in Sect. 7.5.2 on p. 223, as applied to the evaluation homomorphism $e_0 : F[x] \to F$.] 

3. Let g(x) and h(x) be two non-zero polynomials lying in D. The greatest common 
divisors of these two polynomials are unique up to scalar multiplication. 


(i) Show that if x divides both g(x) and h(x) then 
g(x) D + h(x) D = Dd(x) 


where d(x) is any greatest common divisor of g(x) and h(x). 

(ii) Suppose at least one of the two polynomials is not divisible by x. Then any greatest common divisor of g(x) and h(x) has a non-zero constant coefficient. We let d(x) be the appropriate scalar multiple of a greatest common divisor such that the constant term d(0) is a greatest common divisor of g(0) and h(0) in the PID E. (The latter exists since not both of g(0) and h(0) are zero). Thus $d(x) \in D$. Moreover, we have $g(x) = d(x)p(x)$ and $h(x) = d(x)q(x)$. Then $p(0) = g(0)/d(0)$ is an element of E by our choice of d(0) so $p(x) \in D$. Similarly $q(x) \in D$. Moreover these polynomials p(x) and q(x) are relatively prime in F[x]. Show that 


Dg(x) + Dh(x) = Dd(x) 


if and only if 
Dp(x) + Dq(x) = Dd(0). 


4. From now on we assume p(x) and q(x) are elements of D which are relatively prime polynomials in F[x]. As above, d will denote a greatest common divisor of p(0) and q(0) in the PID E. We set $I = Dp(x) + Dq(x)$. Show that there exists a non-zero element $e \in E$ such that $ed \in I$ [Hint: Since p(x) and q(x) are relatively prime in F[x], there exist polynomials A(x) and B(x) in F[x] such that $d = A(x)p(x) + B(x)q(x)$. Then, since F is the field of fractions of the domain E, there exists a non-zero element $e \in E$ such that $eA(0)$ and $eB(0)$ are both elements of E. Then, observe that $eA(x)$ and $eB(x)$ are elements of D and the result follows]. 

5. Show that $d \in I$ [Hint: Since d is a greatest common divisor of elements p(0) and q(0) in the domain E, there exist elements A and B of E such that $d = Ap(0) + Bq(0)$. Write $p(x) = p(0) + p'(x)$ where $p'(x)$ is a multiple of x. Noting that $p'(x)/ed$ lies in D and $ed$ lies in I we see that 

\[ p'(x) = ed \cdot (p'(x)/ed) \]

lies in I, and so $p(0) = p(x) - p'(x)$ is the difference of two elements of I. Similarly argue that q(0) lies in I so that $d = Ap(0) + Bq(0)$ is also in I]. 

6. Show that $I \subseteq Dd$ [Hint: We have $p'(x)/d \in D$ (since its constant term is zero) and $p(0)/d \in E \subseteq D$ (since d is a divisor of p(0) in E by its definition). It follows that 

\[ p(x) = d \cdot ((p(0)/d) + p'(x)/d) \in Dd. \]

Similarly argue that $q(x) \in Dd$. Thus $I \subseteq Dd$]. 
7. Assemble the preceding problems into steps that show that D is a Bezout domain. 
8. Suppose the domain E is not equal to its field of fractions F. Then it contains a prime element p. Show that the ideal J of D generated by the set 

\[ X = \{x, \; x/p, \; x/p^2, \; x/p^3, \ldots\} \]

is not a principal ideal [Hint: Let $J_k$ be the ideal of D generated by the first k elements listed in the set X. Show that $J_1 \subset J_2 \subset \cdots$ is an infinite properly ascending chain of ideals of D]. 

9. Consider the following two integral domains: 

(a) The domain $D_1$ of all polynomials in Q[x] whose constant term is an integer. 

(b) Let K be a field and let S be the set of non-zero polynomials in K[x], the latter being regarded as a subring of K[x, y]. Let $K[x, y]_S$ be the localization of K[x, y] relative to the multiplicative set S. Observe that $K[x, y]_S = K(x)[y]$ where K(x) is the so-called “field of rational functions” in the indeterminate x. Let $D_2$ be the subdomain of those polynomials $p(x, y) \in K[x, y]_S$ for which p(x, 0) is a polynomial. 

Show that both domains $D_1$ and $D_2$ are Bezout domains which are not principal ideal domains. 


10.8.2 Structure Theory of Modules over Z and Polynomial 
Rings 


1. Suppose A is an additive abelian group generated by three elements a, b, and c. 
Find the order of A when A is presented by the following relations 


2a+b+c=0 
a+b+c=0 
a+b+2c=0 


2. Answer the same question when 


2a+2b+c=0 
2a+3b+c=0 
2a+4b+c=0 


3. Suppose A is presented by the following relations 


2a+2b+2c=0 
3b+2c=0 
2a+c=0. 


Write A as a direct product of cyclic groups. 
4. Find the invariant factors of the linear transformation which is represented by the following matrix over the field of rational numbers: 

\[
\begin{pmatrix}
2 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0
\end{pmatrix}
\]

5. Let V be a vector space over the rational field Q and suppose $T : V \to V$ is a linear transformation. We are presented with three possibilities for the non-trivial invariant factors of T: 


Gy %x7.87° e+ 1). 
(ii) $x + 1$, $x^3 + 1$. 
(iii) $1 + x + x^2 + x^3 = (1 + x)(1 + x^2)$. 


(a) What is the dimension of V in each of the three cases (i)-(iii)? 

(b) In which cases is the module V a cyclic module? 

(c) In which cases is the module V irreducible? 

(d) In case (ii) write the rational canonical form of the matrix representing T. 
(e) Write out the full Jordan form of a matrix representing T in case (i). 


6. Suppose $t : V \to V$ is a linear transformation of a vector space over the field Z/(3) represented by the matrix: 

\[
\begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]

Find the order of t as an element of the group GL(4, 3). 


Chapter 11 
Theory of Fields 


Abstract If F is a subfield of a field K, then K is said to be an extension of the field F. For $\alpha \in K$, $F(\alpha)$ denotes the subfield generated by $F \cup \{\alpha\}$, and the extension $F \subseteq F(\alpha)$ is called a simple extension of F. The element $\alpha$ is algebraic over F if $\dim_F F(\alpha)$ is finite. Field theory is largely a study of field extensions. A central theme of this chapter is the exposition of Galois theory, which concerns a correspondence between the poset of intermediate fields of a finite normal separable extension $F \subseteq K$ and the poset of subgroups of Gal(K), the group of automorphisms of K which leave the subfield F fixed element-wise. A pinnacle of this theory is the famous Galois criterion for the solvability of a polynomial equation by radicals. Important side issues include the existence of normal and separable closures, the fact that trace maps for separable extensions are non-zero (needed to show that rings of integral elements are Noetherian in Chap. 9), the structure of finite fields, the Chevalley-Warning theorem, as well as Lüroth’s theorem and transcendence degree. 
Attached are two appendices that may be of interest. One gives an account of fields 
with valuations, while the other gives several proofs that finite division rings are 
fields. There are abundant exercises. 


11.1 Introduction 


In this chapter we are about to enter one of the most fascinating and historically 
interesting regions of abstract algebra. There are at least three reasons why this is so: 


1. Fields underlie nearly every part of mathematics. Of course we could not have 
vector spaces without fields (if only as the center of the coefficient division ring). 
In addition to vector spaces, fields figure in the theory of sesquilinear and mul- 
tilinear forms, algebraic varieties, projective spaces and the theory of buildings, 
Lie algebras and Lie incidence geometry, representation theory, number theory, 
algebraic coding theory as well as many other aspects of combinatorial theory 
and finite geometry, just to name a few areas. 

2. There are presently many open problems involving fields—some motivated by 
some application of fields to another part of mathematics, and some entirely 
intrinsic. 
3. Third, there is the strange drama that so many field-theoretic questions bear on
old historic questions, and indeed, its own history has a dramatic story of its 
own. These questions for fields involve some of the oldest and longest-standing 
problems in mathematics—they all involve the existence or non-existence of roots 
of polynomial equations in one indeterminate. 


11.2 Elementary Properties of Field Extensions 


Recall that a field is an integral domain in which every non-zero element is a unit. 
Examples are: 


1. The field of rational numbers—in fact, the “field of fractions” of an arbitrary 
integral domain D (that is, the localization $D_S$ where $S = D - \{0\}$). A special
case is the field of fractions of the polynomial domain F [x]. This field is called 
the field of rational functions over F in the indeterminate x (even though they are 
not functions at all). 

2. The familiar fields R and C of real and complex numbers, respectively. 

3. The field of integers modulo a prime—more generally, the field D/M where M 
is a maximal ideal of an integral domain D. 


Of course there are many further examples. 

Let K be a field. A subfield is a subset F of K which contains a non-zero element,
and is closed under addition, multiplication, and the taking of multiplicative inverses 
of the non-zero elements. It follows at once that F is a field with respect to these 
operations inherited from K, and that F contains the multiplicative identity of K 
as its own multiplicative identity. Obviously, the intersection of any collection of 
subfields of K is a subfield, and so one may speak of the subfield generated by a 
subset X of K as the intersection of all subfields of K containing the set X. The subfield
of K generated by a subfield F and a subset X is denoted F(X).¹

The subfield generated by the multiplicative identity element, 1, is called the prime 
subfield of K. (By definition, it contains the multiplicative identity as well as zero.) 
Since a field is a species of integral domain, it has a characteristic. If the multiplicative 
identity element 1, as an element of the additive group (K, +), has finite order, then its
order is a prime number p which we designate as the characteristic of K. Otherwise
1 has infinite order as an element of (K, +) and we say that K has characteristic
zero. We write char(K) = p or 0 in the respective cases. In the former case, it is
clear that the prime subfield of K is just Z/pZ; in the latter case the prime subfield is
isomorphic to the field Q of rational numbers. One last observation is the following:


Any subfield F of K contains the prime subfield and possesses the same charac- 
teristic as K. 


¹Of course this notation suffers the defect of leaving K out of the picture. Accordingly we shall
typically be using it when the ambient overfield K of F is clearly understood from the context. 
Since we shall often be dealing in this chapter with cases in which the subfield is
known but the over-field is not, it is handy to reverse the point of view of the previous
paragraph. We say that K is an extension of F if and only if F is a subfield of K. A
chain of fields $F_1 < F_2 < \cdots < F_n$, where each $F_{i+1}$ is a field extension of $F_i$, is
called a tower of fields.

If the field K is an extension of the field F, then K is a right module over F, 
that is, a vector space over F. As such, it possesses a dimension (which might be an
infinite cardinal) as an F-vector space. We call this dimension the degree (or index) 
of the extension F < K, and denote it by the symbol [K : F]. 


11.2.1 Algebraic and Transcendental Extensions 


Let K be a field extension of the field F, and let p(x) be a polynomial in F[x]. We
say that an element a of K is a zero of p(x) (or a root of the equation p(x) = 0)
if and only if $p(x) \neq 0$ and yet p(a) = 0.² If a is a zero in K of a polynomial in
F[x], then a is said to be algebraic over F. Otherwise a is said to be transcendental
over F.

If K is an extension of F, and if a is an element of K, then the symbol F[a]
denotes the subring of K generated by $F \cup \{a\}$. This would be the set of all elements
of K which are F-linear combinations of powers of a. Note that as this subring
contains the field F, it is a vector space over F whose dimension is again denoted
[F[a] : F].

There is then a ring homomorphism


$$\psi : F[x] \to F[a] \subseteq K$$


which takes the polynomial p(x) to the field element p(a) in K. Clearly $\psi(F[x]) = F[a]$.

Suppose now $\ker \psi = 0$. Then $F[x] \cong F[a]$ and a is transcendental since any
non-trivial polynomial of which it is a zero would be a polynomial of positive degree
in $\ker \psi$.


²Two remarks are appropriate at this point. First note that the definition makes it “un-grammatic”
to speak of a zero of the zero polynomial. Had the non-zero-ness of p(x) not been inserted in the 
definition of root, we should be saying that every element of the field is a zero of the zero polynomial. 
That is such a bizarre difference from the situation with non-zero polynomials over infinite fields 
that we should otherwise always be apologizing for that exceptional case in the statements of many 
theorems. It thus seems better to utilize the definition to weed out this awkwardness in advance. 

Second, there are natural objections to speaking of a “root” of a polynomial. Some have asked 
whether it might not be better to follow the German example and write “zero” (Nullstelle) for “root” 
(Wurzel), despite usage of the latter term in some English versions of Galois Theory. However we 
shall follow tradition by speaking of a “root” of an equation, p(x) = 0, while we speak of a “zero” 
of a polynomial, p(x). 
On the other hand, if $\ker \psi \neq 0$, then, as F[x] is a principal ideal domain, we
have $\ker \psi = F[x]p(x) = (p(x))$, for some polynomial p(x). Then


$$F[a] = \psi(F[x]) \cong F[x]/\ker \psi = F[x]/(p(x)).$$


Now, as F[a] is an integral domain, the principal ideal (p(x)) is a prime ideal,
and so p(x) is a prime element of F[x]. Therefore, since F[x] is a PID, we infer
immediately that the ideal $(p(x)) = \ker \psi$ is actually a maximal ideal of F[x], and
so the subring F[a] is a field, which is to say, F[a] = F(a).

The irreducible polynomial p(x) determined by a is unique up to multiplication by
a non-zero scalar. The unique associate of p(x) which is monic is denoted $\mathrm{Irr}_F(a)$. It
is the unique monic polynomial of smallest possible degree which has a for a zero.

Note that in this case, the F-dimension of F [a] is the F-dimension of F[x]/(p(x)), 
which, by the previous chapter can be recognized to be the degree of the polyno- 
mial p(x). 

Now conversely, assume that [F[a] : F] is finite. Then there exists a non-trivial
finite F-linear combination of elements in the infinite list


$$\{1, a, a^2, a^3, \ldots\}$$


which is zero. It follows that a is algebraic in this case.
Summarizing, we have the following: 


Theorem 11.2.1 Let K be an extension of the field F and let a be an element of K. 


1. If a is transcendental over F, then the subring F[a] which it generates is isomorphic
to the integral domain F[x]. Moreover, the subfield F(a) which it generates
is isomorphic to the field of fractions F(x) of F[x].

2. If a is algebraic over F, then the subring F[a] is a subfield intermediate between
K and F. As an extension of F its degree is


$$[F(a) : F] = [F[a] : F] = \deg \mathrm{Irr}_F(a).$$


3. Thus $\dim_F F[a]$ is finite or infinite according as a is algebraic or transcendental
over F.
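
As an illustration of part 2, the following Python sketch (not part of the original text) models F[a] as F[x]/(p(x)) for F = Q and the assumed example p(x) = x^3 - 2, so that a plays the role of the real cube root of 2. The helper names (poly, divmod_poly, inverse_mod) are hypothetical, and polynomials are stored as lists of coefficients, lowest degree first. The sketch computes the inverse of 1 + a by the extended Euclidean algorithm, which is one concrete way to see that the ring F[a] is already a field.

```python
from fractions import Fraction

def poly(*coeffs):
    """Build a polynomial over Q from coefficients, lowest degree first."""
    return trim([Fraction(c) for c in coeffs])

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Quotient and remainder of p by the non-zero polynomial q."""
    rem, q = trim(p), trim(q)
    quo = [Fraction(0)] * max(len(rem) - len(q) + 1, 0)
    while len(rem) >= len(q):
        c = rem[-1] / q[-1]
        shift = len(rem) - len(q)
        quo[shift] = c
        for i, b in enumerate(q):
            rem[shift + i] -= c * b
        rem = trim(rem)
    return trim(quo), rem

def inverse_mod(u, p):
    """Inverse of u in Q[x]/(p), assuming p is irreducible and u is not 0 mod p."""
    # Invariant of the extended Euclidean loop: r0 = s0*u and r1 = s1*u modulo p.
    r0, r1 = trim(p), divmod_poly(u, p)[1]
    s0, s1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, [-c for c in mul(q, s1)])
    if len(r0) != 1:
        raise ValueError("u is not invertible modulo p")
    return divmod_poly([c / r0[0] for c in s0], p)[1]

P = poly(-2, 0, 0, 1)   # p(x) = x^3 - 2
U = poly(1, 1)          # the element 1 + a
INV = inverse_mod(U, P)
print(INV)
```

Since $(1 + a)(1 - a + a^2) = 1 + a^3 = 3$, the computed inverse should be $(1 - a + a^2)/3$; the three stored coefficients also reflect the fact that F[a] has dimension 3 = deg p(x) over F.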


11.2.2 Indices Multiply: Ruler and Compass Problems 


Lemma 11.2.2 If F < K < L is a tower of fields, then the indices “multiply” in
the sense that 
[L: F]=[L: K][K: F]. 
Proof Let A be a basis of L as a vector space over K, and let B be a basis of K as a
vector space over F. Let AB be the set of all products $\{ab \mid a \in A, b \in B\}$. We shall
show that AB is an F-basis of L.

If $\alpha \in L$, then $\alpha$ is a finite K-linear combination of elements of A, say


$$\alpha = \alpha_1 a_1 + \cdots + \alpha_m a_m.$$

Now each coefficient $\alpha_j$, being an element of K, is a finite linear combination

$$\alpha_j = b_{j1}\beta_{j1} + \cdots + b_{jm_j}\beta_{jm_j}$$


of elements $b_{ji}$ of B (with the $\beta_{ji}$ in F). Substituting these expressions for $\alpha_j$ in
the K-linear combination for $\alpha$, we have expressed $\alpha$ as an F-linear combination of
elements of AB. Thus AB is an F-spanning set for L.

It remains to show that AB is an F-linearly independent set. So suppose for some
finite subset S of $A \times B$, that


$$\sum_{(a,b) \in S} \beta_{a,b}\, ab = 0.$$


Then as each $\beta_{a,b}$ is in F, and each b is in K, the left side of the presented formula
may be regarded as a K-linear combination of a’s equal to zero, and so, by the
K-linear independence of A, each coefficient


$$\sum_{b \,:\, (a,b) \in S} \beta_{a,b}\, b$$


of each a is equal to zero. Hence each $\beta_{a,b} = 0$, since B is F-linearly independent.
Thus AB is F-linearly independent and hence is an F-basis for L. Since this entails
that all the products in AB are pairwise distinct elements, we see that


$$|AB| = |A||B|,$$


which proves the lemma. 


Corollary 11.2.3 If $F_1 < F_2 < \cdots < F_n$ is a finite tower of fields, then
$$[F_n : F_1] = [F_n : F_{n-1}] \cdot [F_{n-1} : F_{n-2}] \cdots [F_2 : F_1].$$


We can use this observation to sketch a proof of the impossibility of certain ruler 
and compass constructions. Given a unit length 1, using only a ruler and compass we 
can replicate it m times, or divide it into n equal parts, and so can form all lengths 
which are rational numbers. We can also form right-angled triangles inscribed in a 
circle and so with ruler and compass, we can extract the square root of the differences 
of squares, and hence any square root because of the formula 
$$c = \left(\frac{c+1}{2}\right)^2 - \left(\frac{c-1}{2}\right)^2.$$

If a is such a square root, where c is a rational length, then by ruler and compass
we can form all the lengths in the ring Q[a], which (since a is a zero of $x^2 - c$) is
a field extension of Q of degree 1 or 2. Iterating these constructions a finite number
of times, the possible lengths that we could encounter all lie in the uppermost field
of a tower of fields


$$Q = F_0 < F_1 < \cdots < F_m = K$$


with each $F_{i+1}$ an extension of degree two over $F_i$. By the Corollary above, the index
[K : Q] is a power of 2.

Now we come to the arm-waving part of the proof: One needs to know that once 
a field K of constructible distances has been achieved, the only new number not
in K constructed entirely from old numbers already in K is in fact obtained by 
producing a missing side of a right triangle, two of whose sides have lengths in K. 
(A really severe proof of that fact would require a formalization of exactly what 
“ruler and compass” constructions are—a logical problem beyond the field-theoretic 
applications whose interests are being advanced here.) In this sketch, we assume that 
that has been worked out. 

This means, for example, that we cannot find by ruler and compass alone, a length
a which is a zero of an irreducible polynomial of degree n where n contains an odd
prime factor. For if so, Q[a] would be a subfield of a field K of degree $2^m$ over Q.
On the other hand, the index n = [Q[a] : Q] (see Theorem 11.2.1) must divide
$[K : Q] = 2^m$, by Lemma 11.2.2. But the odd prime in n cannot divide $2^m$. That’s it!
That’s the whole amazingly simple argument!

Thus one cannot solve the problem posed by the oracle of Delos, to “duplicate”
(in volume) a cubic altar, i.e., to find a length a such that $a^3 - 2 = 0$, at least not
with ruler and compass.

Similarly, consider the problem: given an angle $\alpha$, trisect it with ruler and compass.
If one could, then one could construct the length $\cos \beta$ where $3\beta = \alpha$. But
$\cos(3\beta) = 4\cos^3 \beta - 3\cos \beta = \lambda$, where $\lambda = \cos \alpha$. This means we could always
find a zero of $4x^3 - 3x - \lambda$, and when $\alpha = 60^\circ$, so $\lambda = 1/2$, setting y = 2x yields
a constructed zero of $y^3 - 3y - 1$, which
is irreducible over Q. As observed above, this is impossible.
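
The irreducibility claims for the two cubics above can be checked by machine. The sketch below (an addition, not from the original text) relies on the standard rational root theorem, which is not stated in this section: a rational zero p/q in lowest terms of an integer polynomial must have p dividing the constant term and q dividing the leading coefficient. A cubic with no rational zero has no linear factor over Q and is therefore irreducible. The function names are hypothetical, and coefficients are listed lowest degree first.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of the non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational zeros of an integer polynomial with non-zero constant and leading terms."""
    constant, leading = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(constant)
                  for q in divisors(leading)
                  for sign in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r ** k for k, c in enumerate(coeffs)) == 0)

DELOS = rational_roots([-2, 0, 0, 1])        # x^3 - 2
TRISECT = rational_roots([-1, -3, 0, 1])     # y^3 - 3y - 1
SANITY = rational_roots([-6, 11, -6, 1])     # (x - 1)(x - 2)(x - 3)
print(DELOS, TRISECT, SANITY)
```

Both cubics of interest turn out to have no rational zeros, while the third polynomial is included only as a sanity check that the search does find zeros when they exist.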


11.2.3 The Number of Zeros of a Polynomial 


Lemma 11.2.4 Let K be an extension of the field F and suppose the element $\alpha$ of
K is a zero of the polynomial p(x) in F[x]. Then in the polynomial ring $F[\alpha][x]$,
$x - \alpha$ divides p(x).

Proof Since $p(\alpha) = 0$ we may write


$$\begin{aligned}
p(x) &= a_0 + a_1 x + \cdots + a_n x^n \\
&= p(x) - p(\alpha) \\
&= a_1(x - \alpha) + a_2(x^2 - \alpha^2) + \cdots + a_n(x^n - \alpha^n) \\
&= (x - \alpha)\,[a_1 + a_2(x + \alpha) + a_3(x^2 + \alpha x + \alpha^2) + \cdots + a_n(x^{n-1} + \cdots + \alpha^{n-1})].
\end{aligned}$$

Theorem 11.2.5 Let K be an extension of the field F and let p(x) be a polynomial 
in F[x]. Then p(x) possesses at most deg p(x) zeros in K. 


Proof We may assume p(x) is a monic polynomial. If p(x) has no zeroes in K,
we are done. So assume $\alpha$ is a zero of p(x). Then by Lemma 11.2.4 we obtain a
factorization $p(x) = (x - \alpha)p_1(x)$ in K[x], where $\deg p_1(x)$ is one less than the
degree of p(x). Suppose $\beta$ is any zero of p(x) in K which is distinct from $\alpha$. Then


$$0 = p(\beta) = (\beta - \alpha)p_1(\beta).$$


Since the first factor on the right is not zero, and K is a domain, $\beta$ is forced to be a
zero of the polynomial $p_1(x)$. By induction on the degree of p(x), we may conclude
that the number of distinct possibilities for $\beta$ does not exceed the degree of $p_1(x)$.
Thus, as any zero of p(x) is either $\alpha$ or one of the zeroes $\beta$ of $p_1(x)$, we see that the
total number of zeroes of p(x) cannot exceed $1 + \deg p_1(x)$, that is, the degree of
p(x).

Of course, when we write $p(x) = (x - \alpha)p_1(x)$, it may happen that $\alpha$ is also
a zero of $p_1(x)$. In that case, we can again write $p_1(x) = (x - \alpha)p_2(x)$,
and ask whether $\alpha$ is a zero of $p_2(x)$. Pushing this procedure as far as we can, we
eventually obtain a factorization $p(x) = (x - \alpha)^k p_k(x)$ where $p_k(x)$ does not have
$\alpha$ for a zero. Repeating this for the finitely many zeroes $\alpha_i$ of p(x), one obtains a
factorization


$$p(x) = (x - \alpha_1)^{n_1}(x - \alpha_2)^{n_2} \cdots (x - \alpha_m)^{n_m}\, r(x) \qquad (11.1)$$


in K[x] where r(x) possesses no “linear factors”, that is, factors of degree one.
The number $n_i$ is called the multiplicity of the zero $\alpha_i$, and clearly from Eq. (11.1),
$\sum n_i \leq \deg p(x)$. Thus the following slight improvement of Theorem 11.2.5
emerges:

Corollary 11.2.6 If K is an extension of the field F and p(x) is a non-zero polynomial
in F[x], then the number of zeroes of p(x) in K, counting multiplicities, is at most the
degree of p(x).³


³The student should notice that when we count zeroes with their multiplicities we are not doing
anything mysterious. We are simply forming a multiset of zeroes. The Corollary just says that the 
degree of the polynomial bounds the weight of this multiset. 
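
The bound on the number of zeros depends on the coefficients lying in a field (or at least a domain). The sketch below is an added illustration, with a hypothetical function name, that counts zeros of $x^2 - 1$ by brute force in Z/(7), which is a field, and in Z/(8), which is not even a domain. Coefficients are listed lowest degree first.

```python
def zeros_mod(coeffs, n):
    """Zeros in Z/(n) of the integer polynomial coeffs (lowest degree first)."""
    return [x for x in range(n)
            if sum(c * pow(x, k, n) for k, c in enumerate(coeffs)) % n == 0]

X2_MINUS_1 = [-1, 0, 1]
IN_FIELD = zeros_mod(X2_MINUS_1, 7)   # Z/(7) is a field
IN_RING = zeros_mod(X2_MINUS_1, 8)    # Z/(8) has zero divisors
print(IN_FIELD, IN_RING)
```

Over Z/(7) the polynomial has two zeros, in agreement with Theorem 11.2.5, whereas over Z/(8) every odd residue is a zero, so a polynomial of degree 2 acquires four zeros there.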
There are a number of other elementary but important corollaries of Theorem
11.2.5:


Corollary 11.2.7 (Polynomial Identities) Suppose a(x), b(x) are polynomials in
F[x]. Suppose $a(\alpha) = b(\alpha)$ for all $\alpha \in F$. If F contains infinitely many elements,
then

a(x) = b(x)


is a polynomial identity, that is, both sides are the same element of F[x].


Proof We need only show that the polynomial J(x) := a(x) - b(x) is the zero
polynomial. If it were not, then by Theorem 11.2.5 it would possess only finitely many
zeros. But by hypothesis, the infinitely many elements of F are all zeros
of J(x), a contradiction.
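
The hypothesis that F be infinite cannot be dropped. As an added illustration (the choice p = 5 and the names below are assumptions made for the example), the polynomial $J(x) = x^5 - x$ over Z/(5) is not the zero polynomial, yet it vanishes at every element of the field, by Fermat's little theorem.

```python
def values_mod(coeffs, p):
    """Values at 0, 1, ..., p-1 of the integer polynomial coeffs, reduced mod p."""
    return [sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
            for x in range(p)]

P = 5
J = [0, -1, 0, 0, 0, 1]                      # J(x) = x^5 - x, lowest degree first
VALUES = values_mod(J, P)
IS_ZERO_POLYNOMIAL = all(c % P == 0 for c in J)
print(VALUES, IS_ZERO_POLYNOMIAL)
```

So the two polynomials $x^5$ and $x$ define the same function on Z/(5) without being the same element of the polynomial ring.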


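Corollary 11.2.8 (Invertibility of the Vandermonde matrix) Let n be a positive integer.
Suppose $z_1, z_2, \ldots, z_n$ are pairwise distinct elements of a field F. Then the $n \times n$
matrix

$$M = \begin{pmatrix}
1 & z_1 & z_1^2 & \cdots & z_1^{n-1} \\
1 & z_2 & z_2^2 & \cdots & z_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & z_n & z_n^2 & \cdots & z_n^{n-1}
\end{pmatrix}$$


is invertible.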


Proof The matrix M is invertible if and only if its columns $C_1, \ldots, C_n$ are F-linearly
independent. So, if M were not invertible, there would exist coefficients $a_0, \ldots, a_{n-1}$
in F, not all zero, such that


$$\sum_{i=0}^{n-1} a_i C_{i+1} = [0], \text{ the } n \times 1 \text{ zero column vector.} \qquad (11.2)$$


But the j-th entry on the left side of Eq. (11.2) is $p(z_j)$ where p(x) is the non-zero polynomial
$a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} \in F[x]$. This conflicts with Theorem 11.2.5, since the
polynomial p(x) has degree at most n - 1, and yet possesses n distinct zeroes, namely the $z_j$.

Thus the columns of M are F-linearly independent and so M is invertible.


Remark Notice that the proof of Corollary 11.2.8 was determinant-free. Matrices of 
the form displayed in Corollary 11.2.8 are called Vandermonde matrices. 
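
One practical consequence of the invertibility of M is that a polynomial of degree less than n is determined by its values at n distinct points. The sketch below is an added illustration over F = Q with hypothetical function names: it builds M for the assumed points 0, 1, 2 and solves M c = y exactly by Gauss-Jordan elimination, recovering the coefficients c of the interpolating polynomial.

```python
from fractions import Fraction

def vandermonde(zs):
    """Rows (1, z, z^2, ..., z^(n-1)) for each z in zs."""
    n = len(zs)
    return [[Fraction(z) ** j for j in range(n)] for z in zs]

def solve(matrix, rhs):
    """Solve matrix * c = rhs exactly, assuming matrix is square and invertible."""
    n = len(matrix)
    aug = [row[:] + [Fraction(v)] for row, v in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

POINTS = [0, 1, 2]
VALUES = [1, 3, 7]                       # values of 1 + x + x^2 at the points
COEFFS = solve(vandermonde(POINTS), VALUES)
print(COEFFS)
```

If two of the points coincided, the pivot search would fail, which is the computational shadow of the distinctness hypothesis in Corollary 11.2.8.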


For the next result, we require the following definition: a group G is said to be 
locally cyclic if and only if every finitely-generated subgroup of G is cyclic. Clearly 
such a group is abelian. Recall from Exercise (2) in Sect. 5.6.1 that a group is a
torsion group if and only if each of its elements has finite order. In any abelian group
A, the set of elements of finite order is closed under multiplication and the taking
of inverses, and so forms a subgroup which we call the torsion subgroup of A.
Corollary 11.2.9 Let F be any field and let F* be its multiplicative group of non-zero
elements. Then the torsion subgroup of F* is locally cyclic.


Proof Any finitely-generated subgroup of the torsion subgroup of F* is a finite
abelian group A. If A were not cyclic, it would contain a subgroup isomorphic to
$Z_p \times Z_p$, for some prime p. In that case, the polynomial $x^p - 1$ would have at least
$p^2$ zeros in F, contrary to Theorem 11.2.5.
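
For a finite field F the whole group F* is a torsion group, so the Corollary says that F* is cyclic. The following brute-force sketch is an added illustration for the prime fields Z/(7) and Z/(11), with hypothetical function names; it lists the elements whose multiplicative order equals p - 1, that is, the generators of the cyclic group.

```python
def multiplicative_order(x, p):
    """Order of x in the multiplicative group of Z/(p), for 1 <= x < p, p prime."""
    k, y = 1, x % p
    while y != 1:
        y = (y * x) % p
        k += 1
    return k

def generators(p):
    """All generators of the multiplicative group of Z/(p), p prime."""
    return [x for x in range(1, p) if multiplicative_order(x, p) == p - 1]

GENS_7 = generators(7)
GENS_11 = generators(11)
print(GENS_7, GENS_11)
```

The search is exponentially slower than necessary for large p and is meant only to make the statement of the Corollary tangible in small cases.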


11.3 Splitting Fields and Their Automorphisms 
11.3.1 Extending Isomorphisms Between Fields 


Suppose E is a field extension of F and that $\alpha$ is an element of E which is algebraic
over F. Then the development in Sect. 11.2.1 showed that there is
a unique irreducible monic polynomial,


$$g(x) = \mathrm{Irr}_F(\alpha) \in F[x],$$


having $\alpha$ as a zero. The subfield $F(\alpha)$ of E generated by $F \cup \{\alpha\}$ is the subring
$F[\alpha]$, which we have seen is isomorphic to the factor ring F[x]/(g(x)).

We remind the reader of a second principle. Suppose $f : F_1 \to F_2$ is an isomorphism
of fields. Then f can be extended to a ring isomorphism


$$f^* : F_1[x] \to F_2[x],$$

which takes the polynomial


$$p = a_0 + a_1 x + \cdots + a_n x^n$$


to


$$f^*(p) = f(a_0) + f(a_1)x + \cdots + f(a_n)x^n.$$


It is obvious that $f^*$ takes irreducible polynomials of $F_1[x]$ to irreducible polynomials
of $F_2[x]$.
With these two principles in mind we record the following: 


Lemma 11.3.1 (Fundamental Lemma on Extending Isomorphisms) Let E be a field
extension of F, and suppose $f : F \to \bar{E}$ is an embedding of F as a subfield $\bar{F}$ of the
field $\bar{E}$. Let $f_1$ be the induced isomorphism $f_1 : F \to \bar{F}$ obtained by resetting the
codomain of f. Next let $f_1^*$ be the extension of this field isomorphism $f_1$ to a ring
isomorphism $F[x] \to \bar{F}[x]$, as described above.

Finally, let $\alpha$ be an element of E which is algebraic over F, and let g be its monic
irreducible polynomial in F[x].
Then the following assertions hold:


(i) The embedding (injective morphism) $f : F \to \bar{E}$ can be extended to an embedding
$$\hat{f} : F(\alpha) \to \bar{E},$$
if and only if the polynomial $f_1^*(g) := \bar{g}$ has a zero in $\bar{E}$.

(ii) Moreover, for each zero $\bar{\alpha}$ of $\bar{g}$ in $\bar{E}$, there is a unique embedding
$\hat{f} : F(\alpha) \to \bar{E}$, taking $\alpha$ to $\bar{\alpha}$ and extending f.

(iii) The number of extensions $\hat{f} : F(\alpha) \to \bar{E}$ of the embedding f is equal to the
number of distinct zeros of $\bar{g}$ to be found in $\bar{E}$.


Proof (i) If there is an embedding $\hat{f} : F(\alpha) \to \bar{E}$ extending the embedding f, then
$\hat{f}(\alpha)$ is a zero of $\bar{g}$ in $\bar{E}$.

Conversely, if $\bar{\alpha}$ is a zero of $\bar{g}$ in $\bar{E}$, then $\bar{F}(\bar{\alpha})$ is a subfield of $\bar{E}$ isomorphic to
$\bar{F}[x]/(\bar{g}(x))$. Similarly, $F(\alpha)$ is isomorphic to F[x]/(g(x)). But the ring isomorphism
$f_1^* : F[x] \to \bar{F}[x]$ takes the maximal ideal (g(x)) of F[x] to the maximal ideal $(\bar{g}(x))$
of $\bar{F}[x]$, and so induces an isomorphism of the corresponding factor rings:


$$f' : F[x]/(g(x)) \to \bar{F}[x]/(\bar{g}(x)).$$


Linking up these three isomorphisms


$$F(\alpha) \to F[x]/(g(x)) \xrightarrow{\ f'\ } \bar{F}[x]/(\bar{g}(x)) \to \bar{F}(\bar{\alpha}) \subseteq \bar{E},$$


yields the desired embedding.
(ii) If there were two embeddings $h_1, h_2 : F(\alpha) \to \bar{E}$, taking $\alpha$ to $\bar{\alpha}$ and extending
f, then both would have image $\bar{F}(\bar{\alpha})$, and the composition of the inverse of one with
the other would fix $\alpha$, would fix F element-wise, and so would be the identity mapping
on $F(\alpha)$. Thus, for any $\beta \in F(\alpha)$,
$$h_1(\beta) = h_2\big((h_2^{-1} \circ h_1)(\beta)\big) = h_2(\beta).$$


Thus $h_1$ and $h_2$ would be identical mappings.
(iii) This part follows from parts (i) and (ii).


11.3.2 Splitting Fields 


Suppose E is an extension field of F. Then, of course, any polynomial in F[x] can be
regarded as a polynomial in E[x], and any factorization of it in F[x] is a factorization
in E[x]. Put another way, F[x] is a subdomain of E[x].

A non-zero polynomial p(x) in F[x] of positive degree is said to split over E if
and only if p(x) factors completely into linear factors (that is, factors of degree 1)
in E[x]. Such a factorization has the form

$$p(x) = a(x - a_1)^{n_1}(x - a_2)^{n_2} \cdots (x - a_r)^{n_r} \qquad (11.3)$$


where a is a nonzero element of F, the exponents $n_i$ are positive integers and the
collection $\{a_1, \ldots, a_r\}$ of zeros of p(x) is contained in E.

Obviously, if E′ is a field containing E and the polynomial p(x) splits over E,
then it splits over E′. Also, if $E_0$ were a subfield of E containing F together with all
of the zeros $\{a_1, \ldots, a_r\}$, then p(x) would split over $E_0$, since the factorization
in Eq. (11.3) can take place in $E_0[x]$. In particular, this is true if $E_0 = F(a_1, \ldots, a_r)$,
the subfield of E generated by F and the zeros $a_i$, $i = 1, \ldots, r$.

An extension field E of F is called a splitting field of the polynomial $p(x) \in F[x]$
over F if and only if


(S1) $F \subseteq E$.
(S2) p(x) factors completely into linear factors in E[x] as


$$p(x) = a(x - a_1)(x - a_2) \cdots (x - a_n).$$


(S3) $E = F(a_1, \ldots, a_n)$, that is, E is generated by F and the zeros $a_i$.


The following observations follow directly from the definition just given and the 
proofs are left as an exercise. 


Lemma 11.3.2. The following hold: 


(i) If E is a splitting field for p(x) € F[x] over F, and L is a subfield of E 
containing F, then E is a splitting field for p(x) over L. 

(ii) Suppose p(x) and q(x) are polynomials in F[x]. Suppose E is a splitting field
for p(x) over F and K is a splitting field for q(x) over E. Then K is a splitting 
field for p(x)q(x) over F. 


Our immediate goals are to show that splitting fields for a fixed polynomial exist, 
to show that they are unique up to isomorphism, and to show that between any two 
such splitting fields, the number of isomorphisms is bounded by a function of the 
number of distinct zeros of f(x). 


Theorem 11.3.3 (Existence of Splitting Fields) If $f(x) \in F[x]$, then a splitting
field for f(x) over F exists. Its degree over F is at most the factorial number d!,
where d is the sum of the degrees of the non-linear irreducible polynomial factors of
f(x) in F[x].


Proof Let $f(x) = f_1(x)f_2(x) \cdots f_m(x)$ be a factorization of f(x) into irreducible
factors in F[x]. (This is possible by Theorem 7.3.2, Part 3.) Set n = deg f(x), and
proceed by induction on k = n - m. If k = 0, then n = m, so each factor $f_i(x)$
is linear and clearly E = F satisfies all the conclusions of the theorem. So assume
k > 0. Then some factor, say $f_1(x)$, has degree greater than 1. Form the field


$$L = F[x]/(f_1(x)),$$


(this is a field since $f_1(x)$ is irreducible and it is an extension of F since it is a
vector-space over F). Observe that the coset $a := x + F[x]f_1(x)$ is a zero of $f_1(x)$
(and hence of f(x)) in the field L. Thus in L[x], we have the factorizations:


$$f_1(x) = (x - a)h_1(x),$$


and


$$f(x) = (x - a)h_1(x)f_2(x) \cdots f_m(x),$$


so that f(x) has $\ell > m$ irreducible factors in L[x]. Thus $n - \ell < n - m$ and so by
induction, there is a splitting field E of f(x) over L. Moreover, the degree [E : L] is at
most (d - 1)! since the sum of the degrees of the non-linear irreducible factors of f(x)
in L[x] has been
reduced by at least one because of the appearance of the new linear factor x - a.

Now we claim that E is a splitting field of f(x) over F. We must verify the
three defining properties of a splitting field given at the beginning of this subsection.

Property (S1) holds since $L \subseteq E$ implies $F \subseteq E$. We already have (S2) from the
definition of E. It remains to see that E is generated by the zeros of f. Since (x - a)
is one of the factors of f(x) in the factorization


$$f(x) = (x - a_1)(x - a_2) \cdots (x - a_n)$$


in E[x], we may assume without loss of generality that $a = a_1$. We have
$E = L(a_2, \ldots, a_n)$ from the definition of E. But $L = F(a_1)$, so


$$E = L(a_2, \ldots, a_n) = F(a_1)(a_2, \ldots, a_n) = F(a_1, \ldots, a_n),$$
and so (S3) holds. Thus E is indeed a splitting field for f(x) over F.


Finally, since $[L : F] = \deg f_1(x) \leq d$, and $[E : L] \leq (d - 1)!$ we obtain
$[E : F] = [E : L][L : F] \leq (d - 1)! \cdot d = d!$ as required.
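
The key step of the proof, forming $L = F[x]/(f_1(x))$ and observing that the coset of x is a zero of $f_1$, can be made concrete in a very small case. The sketch below is an added illustration under assumed data: F = Z/(2) and $f_1(x) = x^2 + x + 1$, which is irreducible over F because it has no zero there. Elements of L are encoded (hypothetically) as pairs (c0, c1) standing for c0 + c1·a, where a is the coset of x and therefore satisfies $a^2 = a + 1$.

```python
def add(u, v):
    """Sum of c0 + c1*a elements, coefficients in Z/(2)."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """Product in L, using the relation a^2 = a + 1."""
    # (u0 + u1 a)(v0 + v1 a) = u0 v0 + (u0 v1 + u1 v0) a + u1 v1 (a + 1)
    c0 = u[0] * v[0] + u[1] * v[1]
    c1 = u[0] * v[1] + u[1] * v[0] + u[1] * v[1]
    return (c0 % 2, c1 % 2)

def f1(u):
    """Evaluate x^2 + x + 1 at the element u of L."""
    return add(add(mul(u, u), u), (1, 0))

ELEMENTS = [(c0, c1) for c0 in (0, 1) for c1 in (0, 1)]
ZEROS = [u for u in ELEMENTS if f1(u) == (0, 0)]
print(ZEROS)
```

Here the two zeros found are a and a + 1, so $f_1$ already splits over L, and the degree [L : F] = 2 meets the bound d! = 2 of the theorem in this case.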


Theorem 11.3.4 Let $\eta : F \to \bar{F}$ be an isomorphism of fields, which we extend to a
ring isomorphism $\eta^* : F[x] \to \bar{F}[x]$, and let us write $\bar{f}(x)$ for $\eta^*(f(x))$ for each
$f(x) \in F[x]$. Suppose E and $\bar{E}$ are splitting fields of f(x) over F, and $\bar{f}(x)$ over
$\bar{F}$, respectively. Then $\eta$ can be extended to an isomorphism $\hat{\eta}$ of E onto $\bar{E}$, and the
number of ways of doing this is at most the index [E : F]. If the irreducible factors of
f(x) have no multiple zeros in E, then there are exactly [E : F] such isomorphisms.


Proof We use induction on [E : F]. If [E : F] = 1, f(x) factors into linear factors in F[x],
and there is precisely one isomorphism extending $\eta$, namely $\eta$ itself.

Assume [E : F] > 1. Then f(x) contains an irreducible factor g(x) of degree
greater than 1. Since $\eta^*$ is an isomorphism of polynomial rings, $\bar{g}(x)$ is an irreducible
factor of $\bar{f}(x)$ with the same degree as g(x). Let $a \in E$ be a zero of g(x), and
set K = F(a). Since the irreducible polynomial $\bar{g}(x)$ splits completely in $\bar{E}$, we
may apply the Fundamental Lemma on Extending Isomorphisms (Lemma 11.3.1), to infer that the
isomorphism $\eta$ has k extensions $\zeta_1, \ldots, \zeta_k : F(a) \to \bar{E}$, where k is the number of
distinct zeros of $\bar{g}(x)$ in $\bar{E}$.

Note that if m = deg g(x), then [K : F] = m and $k \leq m$. If the zeros of $\bar{g}(x)$ are
distinct, k = m.

Now clearly E is a splitting field of f(x) over K, and $\bar{E}$ is a splitting field of
$\bar{f}(x)$ over each $\zeta_i(K)$. Since [E : K] < [E : F], induction implies that each $\zeta_i$ can
be extended to an isomorphism $E \to \bar{E}$ in at most [E : K] ways, and in exactly
[E : K] ways if the irreducible factors of $f(x) \in K[x]$ have distinct zeros in E.
However if the irreducible factors of $f(x) \in F[x]$ have distinct zeros in E, the same
is obviously true for the irreducible factors of $f(x) \in K[x]$. This yields at most
$k[E : K] \leq [K : F][E : K] = [E : F]$ isomorphisms in general, and exactly
[E : F] isomorphisms if the irreducible factors of f(x) have distinct zeros in E.

This proves the theorem. 


Corollary 11.3.5 If $E_1$ and $E_2$ are two splitting fields for $f(x) \in F[x]$ over F,
there exists an F-linear field isomorphism:


$$\sigma : E_1 \to E_2.$$


Proof This follows immediately from Theorem 11.3.4 for the case $\eta = 1_F$, the
identity mapping on $F = \bar{F}$.


Example 51 Suppose F = Q, the rational field, and $p(x) = x^4 - 5$, irreducible
by Eisenstein’s criterion. One easily has that the complex zeros of p(x)
are $\pm\sqrt[4]{5}, \pm i\sqrt[4]{5} \in C$, where, of course, i is the imaginary complex unit ($i^2 = -1$),
and $\sqrt[4]{5}$ is the real fourth root of 5. This easily implies that the splitting field $E \subseteq C$
can be described by setting $E = Q(i, \sqrt[4]{5})$. Since $p(x) \in Q[x]$ is irreducible by
Eisenstein, we conclude that $[Q(\sqrt[4]{5}) : Q] = 4$. Since E is the complex splitting
field over $Q(\sqrt[4]{5})$ of the polynomial $x^2 + 1 \in Q(\sqrt[4]{5})[x]$, and since $Q(\sqrt[4]{5}) \neq E$
(the former consists of real numbers), we
infer that $[E : Q(\sqrt[4]{5})] = 2$, giving the splitting field extension degree:


$$[E : Q] = [E : Q(\sqrt[4]{5})] \cdot [Q(\sqrt[4]{5}) : Q] = 2 \cdot 4 = 8.$$


This shows that the bound $[E : F] \leq d!$ need not be attained. Note, finally, by
Theorem 11.3.4, that there are exactly 8 = [E : Q] distinct Q-automorphisms of E.
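
The eight automorphisms of Example 51 can be tabulated explicitly. Writing r for the real fourth root of 5, a Q-automorphism of E is determined by the images of r and i, which must again be zeros of $x^4 - 5$ and $x^2 + 1$ respectively; that all eight resulting assignments really are automorphisms is taken here from the count in the Example. The sketch below is an added illustration with a hypothetical encoding: the pair (k, e) stands for the map sending r to $i^k r$ and i to $i^e$, with e = 1 or 3. It computes compositions and checks that the group is closed but not commutative.

```python
from itertools import product

# Hypothetical encoding: (k, e) stands for the map with
#   r -> i^k * r   (k in {0, 1, 2, 3})
#   i -> i^e       (e = 1 for i -> i, e = 3 for i -> -i)
AUTOMORPHISMS = list(product(range(4), (1, 3)))

def compose(s, t):
    """Encoding of s after t (apply t first, then s)."""
    (ks, es), (kt, et) = s, t
    # s(t(r)) = s(i^kt * r) = i^(es*kt) * i^ks * r, and s(t(i)) = i^(es*et)
    return ((es * kt + ks) % 4, (es * et) % 4)

def root_permutation(s):
    """How s permutes the zeros i^j * r of x^4 - 5, as a tuple indexed by j."""
    k, e = s
    return tuple((e * j + k) % 4 for j in range(4))

CLOSED = all(compose(s, t) in AUTOMORPHISMS
             for s in AUTOMORPHISMS for t in AUTOMORPHISMS)
COMMUTATIVE = all(compose(s, t) == compose(t, s)
                  for s in AUTOMORPHISMS for t in AUTOMORPHISMS)
PERMUTATIONS = {root_permutation(s) for s in AUTOMORPHISMS}
print(len(AUTOMORPHISMS), CLOSED, COMMUTATIVE, len(PERMUTATIONS))
```

Each automorphism induces a different permutation of the four zeros of p(x), so the group acts faithfully on them, as one expects since E is generated over Q by these zeros.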


We close this subsection with a useful observation: 


Lemma 11.3.6 Suppose K is an extension of a field F, and that K contains a 
subfield E which is a splitting field for a polynomial p(x) € F[x] over F. Then any 
automorphism of K fixing F point-wise must stabilize E. 


Remark The Lemma just says that the splitting field E is “characteristic” among 
fields in K which contain F—that is, E is invariant under all F-linear automorphisms 
of K. 
Proof It suffices to note that if $\sigma$ is an F-linear automorphism of K, then for any
zero a of p(x) in K, $\sigma(a)$ is also a zero of p(x) in K. Since E is generated by F
and the zeroes of p(x) in K, we have $\sigma(E) = E$, for all F-automorphisms $\sigma$ of K.


11.3.3 Normal Extensions 


So far the notion of splitting field is geared to a particular polynomial. The purpose 
of this section is to show that the polynomial-splitting property can be seen as a 
property of a field extension itself, independent of any particular polynomial. 

Before proceeding further, let us streamline our language concerning field auto- 
morphisms. 


Definition Let K and L be fields containing a common subfield F. An isomorphism
$K \to L$ of K onto L is called an F-isomorphism if and only if it fixes the subfield
F element-wise. (Heretofore, we have been calling these F-linear isomorphisms.)
Of course, if K = L, an F-isomorphism, $K \to L$, is called an F-automorphism.
Finally, an F-isomorphism of K onto a subfield of L is called an F-embedding.


We say that a finite extension E of F is normal over F if and only if E is the 
splitting field over F of some polynomial of F[x]. Just as a reminder, recall that this 
means that there is a polynomial p(x) which splits completely into linear factors in 
E[x] and that E is generated by F and all the zeros of p(x) that lie in E. 

Note that if E is a normal extension of F and K is an intermediate field, that is,
F < K < E, then E is a normal extension of K.

We have a criterion for normality. 


Theorem 11.3.7 (A characterization of normal field extensions⁴) The following are
equivalent for the finite extension $F \subseteq E$:


(i) E is normal over F;
(ii) every irreducible polynomial $g(x) \in F[x]$ having a zero in E must split completely
into linear factors in E[x].


Proof Suppose E is normal over F, so that E is the splitting field of a polynomial
$f(x) \in F[x]$. Let g(x) be an irreducible polynomial in F[x] with a zero a in E. Let
$K \supseteq E$ be a splitting field over E for g(x) and let b be an arbitrary zero of g(x) in K.
Since g(x) is irreducible in F[x], there is an isomorphism $\sigma : F(a) \to F(b)$ which is
the identity mapping when restricted to F. Furthermore, E is clearly a splitting field
for f(x) over F(a); likewise E(b) is a splitting field for f(x) over F(b). Therefore,
we may apply Theorem 11.3.4, to obtain an isomorphism $\tau : E \to E(b)$ extending
$\sigma : F(a) \to F(b)$. In particular, this implies that [E : F(a)] = [E(b) : F(b)].


⁴ In many books, the characterizing property (ii) given in this theorem is taken to be the definition
of “normal extension”. This does not alter the fact that the equivalence of the two distinct notions
must be proved.

Since [F(a) : F] = [F(b) : F], it follows that [E : F] = [E(b) : F], and so b ∈ E.
As b was an arbitrary zero of g(x) in K, every zero of g(x) lies in E, and so g(x)
splits completely into linear factors in E[x].

Next, suppose that the extension F ⊆ E satisfies condition (ii). Since E is a finite
extension of F, it has an F-basis, say $\{u_1, \ldots, u_n\}$. Set $g_i(x) = \mathrm{Irr}_F(u_i)$, and let
p(x) be the product of $g_1(x), g_2(x), \ldots, g_n(x)$. Then by hypothesis, every zero of
p(x) is in E and p(x) splits into linear factors in E[x]. Since the $u_i$ are among these
zeros and generate E over F, we see a fortiori that the zeros of p(x) generate
E over F. That is, E is a splitting field for p(x) over F.


Corollary 11.3.8 If E is a (finite) normal extension of F and K is any intermediate
field, then any F-embedding

σ : K → E

can be extended to an automorphism of E fixing F element-wise.


Proof E is a splitting field over F for a polynomial f(x) ∈ F[x]. Thus E is a
splitting field for f(x) over K as well as over σ(K). The result then follows from
Theorem 11.3.4.


Suppose K/F is a finite extension. Then $K = F(a_1, \ldots, a_n)$ for some finite set of
elements $\{a_i\}$ of K (for example, an F-basis of K). As in the proof of Theorem 11.3.7
we let p(x) be the product of the polynomials $g_i(x) := \mathrm{Irr}_F(a_i)$, i = 1, 2, ..., n.
Now any normal extension E ⊇ F capturing K as an intermediate field must contain
every zero of each $g_i(x)$ and hence every zero of p(x). Thus, between E and K there exists
a splitting field L of p(x) over F. The splitting field L is the “smallest” normal
extension of F containing K in the sense that any other finite normal extension
E′ ⊇ F which contains K also contains an isomorphic copy of L; that is, there is
an embedding L → E′ whose image contains K. Since this global description of
the field L is independent of the polynomial p(x), we have the following:


Corollary 11.3.9 (Normal Closure) If K is a finite extension of F, then there is a
normal extension L of F, unique up to F-isomorphism, containing K and having
the property that for every normal extension E ⊇ F containing K, there is an
F-isomorphism of L onto a subfield of E which is normal over F and contains K.


The extension L of F so defined is called the normal closure of K over F. The
reader should bear in mind that this notion depends critically on F. For example, it
may happen that K is not normal over F, but is normal over some intermediate field
N. Then the normal closure of K over N is just K itself while the normal closure of
K over F could be larger than K.
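
A standard illustration, which is not taken from the discussion above and is offered only as a sketch: take F = ℚ and K = ℚ(∛2). The symbol ω below denotes a primitive cube root of unity, a notation introduced here for the example.

```latex
\[
  x^{3} - 2 \;=\; (x - \sqrt[3]{2})\,(x - \omega\sqrt[3]{2})\,(x - \omega^{2}\sqrt[3]{2}),
  \qquad \omega = e^{2\pi i/3},
\]
\[
  K = \mathbb{Q}(\sqrt[3]{2}) \subseteq \mathbb{R}, \qquad
  L = \mathbb{Q}(\sqrt[3]{2},\,\omega), \qquad
  [K : \mathbb{Q}] = 3, \qquad [L : \mathbb{Q}] = 6 .
\]
```

Here the irreducible polynomial x³ − 2 has a zero in K but does not split in K[x], since its other two zeros are not real; so K fails condition (ii) of Theorem 11.3.7 and is not normal over ℚ. The field L is the splitting field of x³ − 2 and is the normal closure of K over ℚ, while the normal closure of K over N = K is K itself.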

Another consequence of Corollary 11.3.8 is this: 


Corollary 11.3.10 (Normality of invariant subfields) Let K be a normal extension 
of the field F of finite degree, and let G be the full group of F-automorphisms of 
K. If L is a subfield of K containing F, then L is normal over F if and only if it is 
G-invariant.

Proof If L is normal over F, then L is the splitting field for some polynomial
f(x) ∈ F[x]. But the zeros of f(x) in K are permuted among themselves by
G, and so the subfield over F that they generate, namely L, is G-invariant.

On the other hand, assume L is G-invariant. Suppose p(x) is an irreducible
polynomial in F[x] with at least one zero α in L. Since K is normal over F, p(x)
splits completely in K[x] by Theorem 11.3.7; let β be another zero of p(x)
in K. Then there is an F-isomorphism F[α] → F[β] of subfields of K taking α to β, which, by
Corollary 11.3.8, can be extended to an element of G. Since L is G-invariant, β ∈ L.
Thus p(x) has all its zeros in L and so splits completely over L. By Theorem 11.3.7,
L is normal over F.


11.4 Some Applications to Finite Fields 


Suppose now that F is a finite field, that is, one which contains finitely many
elements. Then, of course, its prime subfield P is the field Z/pZ, where p is a prime.
Furthermore, F is a finite-dimensional vector space over P, say of dimension n.
This forces $|F| = p^n = q$, a prime power.

It now follows from Corollary 11.2.9 that the finite multiplicative group F* of
non-zero elements of F is a cyclic group of order q − 1. This means that F contains
all the q − 1 zeros of the polynomial $x^{q-1} - 1$ in P[x]. Since 0 is also a root of the
equation $x^q - x = 0$, the following is immediate:


Lemma 11.4.1 F is a splitting field of the polynomial $x^q - x$ ∈ P[x].


Since splitting fields are unique up to isomorphism, it follows immediately that F
is uniquely determined up to P-isomorphism by the prime-power q alone.


Corollary 11.4.2 For any given prime-power q, there is, up to isomorphism, exactly 
one field with q elements. 


One denotes any member of this isomorphism class by the symbol GF(q).
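
The statements above can be checked by brute force on a small case. The following is a minimal sketch, assuming we model GF(9) as pairs (a, b) standing for a + bi with i² = −1 over Z/3Z (this model works because x² + 1 has no zero mod 3); the names `add`, `mul`, `power`, and `order` are illustrative helpers, not notation from the text.

```python
from itertools import product

P = 3  # characteristic; x^2 + 1 is irreducible over Z/3Z because -1 is not a square mod 3

def add(u, v):
    """Add two elements a + b*i of GF(9), stored as pairs (a, b)."""
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    """Multiply (a + b*i)(c + d*i) using i^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, n):
    """Compute u^n by repeated multiplication (n >= 0)."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

def order(u):
    """Multiplicative order of a non-zero element u."""
    k, w = 1, u
    while w != (1, 0):
        w = mul(w, u)
        k += 1
    return k

ELEMENTS = list(product(range(P), repeat=2))
Q = len(ELEMENTS)  # q = 9

# Lemma 11.4.1: every element is a zero of x^q - x.
all_roots = all(power(u, Q) == u for u in ELEMENTS)

# F* is cyclic of order q - 1: list its generators.
generators = [u for u in ELEMENTS if u != (0, 0) and order(u) == Q - 1]
```

In this model `all_roots` is true, and the elements of order q − 1 = 8 are exactly the generators of the cyclic group F*.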


11.4.1 Automorphisms of Finite Fields 


Suppose F is a finite field with exactly q elements, that is, F ≅ GF(q). We have seen
that char(F) = p, a prime number, where, for some positive integer n, $q = p^n$.
We know from Lemma 7.4.6 of Sect. 7.4.2, p. 221 that the “pth power mapping”
σ : F → F is a ring endomorphism. Its kernel is trivial since F contains no
non-zero nilpotent elements. Thus σ is injective, and so, by the pigeon-hole principle,
is bijective; that is, it is an automorphism of F. Finally, since the multiplicative
subgroup P* of the prime subfield P = Z/pZ has order p − 1, it follows immediately
that $a^{p-1} = 1$ for all 0 ≠ a ∈ P. This implies that $a^p = a$ for all a ∈ P, and hence
the above p-power automorphism of F is a P-automorphism.⁵

Now suppose the pth power mapping σ has order k. Then $a^{p^k} = a$ for all elements
a of F. But as there are at most $p^k$ roots of $x^{p^k} - x = 0$, we see that k ≥ n. On the
other hand, we have already seen above that $a^{p^n} = a^q = a$ for all a ∈ F, so $\sigma^n$ is
the identity on F, forcing k ≤ n.
Thus k = n and σ generates a cyclic subgroup of Aut(F) of order n.

We proceed now to show that, in fact, ⟨σ⟩ = Aut(F). Let τ be an arbitrary
automorphism of F, and fix a generator θ of the multiplicative group F*. Then there
exists an integer t, 0 < t < q − 1, such that $\tau(\theta) = \theta^t$. Then, as τ is an automorphism,
$\tau(\theta^i) = (\theta^i)^t$ for every i. Since also $0^t = 0 = \tau(0)$, it follows that the automorphism τ is a
power mapping $a \mapsto a^t$ on F. Next, since τ must fix 1 and preserve addition, one has

\[
  a^t + 1 = (a + 1)^t = a^t + \binom{t}{1} a^{t-1} + \cdots + \binom{t}{t-1} a + 1 .
\]

Thus, if t > 1, all of the q − 1 elements a in F* are roots of the equation

\[
  \sum_{k=1}^{t-1} \binom{t}{k} x^{k} = 0. \tag{11.4}
\]

If the polynomial on the left were not identically zero, its degree, which is at most t − 1,
would be at least as large as the number of its distinct zeros, which is at least q − 1. That
would force t ≥ q, which is outside the range of t. Thus each of the binomial coefficients
in the equation is zero in F, that is, divisible by p. In particular p divides $\binom{t}{1} = t$. Thus
either p divides t or t = 1.

Now if $t = p^k \cdot s$, where s is not divisible by p, we could apply the argument of
the previous paragraph to $\rho := \tau\sigma^{-k}$, which is the power mapping $a \mapsto a^s$, to conclude
that s = 1. Thus t is a power of p and so τ is a power of σ.

We have shown the following: 


Corollary 11.4.3 If F is the finite field GF(q), $q = p^n$, then the full automorphism
group of F is cyclic of order n and is generated by the pth power mapping.
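
For a very small field the corollary can be confirmed by exhaustive search. The sketch below assumes GF(8) is modelled as 3-bit integers (polynomials over GF(2) reduced modulo x³ + x + 1, with XOR as addition); the encoding and the helper names `mul`, `is_automorphism`, and `order` are choices made for this illustration only.

```python
from itertools import permutations

MODULUS = 0b1011  # x^3 + x + 1, irreducible over GF(2)

def mul(a, b):
    """Multiply two elements of GF(8), encoded as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # degree reached 3: reduce modulo x^3 + x + 1
            a ^= MODULUS
    return result

ELEMENTS = range(8)

def is_automorphism(t):
    """t[a] is the image of a; addition in GF(8) is bitwise XOR."""
    return all(
        t[a ^ b] == t[a] ^ t[b] and t[mul(a, b)] == mul(t[a], t[b])
        for a in ELEMENTS for b in ELEMENTS
    )

# Every automorphism fixes 0 and 1, so only the other six elements can move.
automorphisms = [
    (0, 1) + image
    for image in permutations(range(2, 8))
    if is_automorphism((0, 1) + image)
]

# The pth power mapping for p = 2.
frobenius = tuple(mul(a, a) for a in ELEMENTS)

def order(t):
    """Order of the bijection t under composition."""
    k, current = 1, t
    while current != tuple(ELEMENTS):
        current = tuple(t[current[a]] for a in ELEMENTS)
        k += 1
    return k
```

The search finds exactly three automorphisms, and the squaring map has order 3, in agreement with q = 2³ and n = 3.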


Remark The argument above, that Aut(F) = ⟨σ⟩, could be achieved in one stroke
by the “Dedekind Independence Lemma”; see Sect. 11.6.2. However, the above
argument uses only the most elementary properties introduced so far.


We see at this point that [F : P] = |Aut(F)|. In the language of a later section, we 
would say that F is a “Galois extension” of P. It will have very strong consequences 
for us. For one thing, it will eventually mean that no irreducible polynomial of positive 
degree in GF(q)[x] can have repeated zeros in some extension field. 


⁵ Actually, it’s pretty easy to see directly that if E is any field and P is its prime subfield, then any
automorphism of E is a P-automorphism.


11.4.2 Polynomial Equations Over Finite Fields: The Chevalley-Warning Theorem


The main result of this subsection is a side-issue in so far as it does not really play a
role in the development of the Galois theory. However, it represents a property of
finite fields which is too important not to mention before leaving a section devoted
to these fields. The student wishing to race on to the theory of Galois extensions and
solvability of equations by radicals may safely skip this subsection and return to it
later.

For this subsection, fix a finite field K = GF(q) of characteristic p and cardinality
q, a power of p. Now for any natural number k, let

\[
  S(k, q) = \sum_{a \in K} a^{k},
\]

the sum of the kth powers of the elements of the finite field K of q elements (with the
convention $a^0 = 1$ for every a ∈ K, including a = 0). Clearly,
if k = 0, the sum is S(0, q) = q = 0, since the field has characteristic p dividing q.
If k is a multiple of q − 1 distinct from 0, then as K* is a cyclic group of order q − 1,

\[
  S(k, q) = \sum_{a \in K^{*}} 1 = q - 1 = -1 .
\]


Now suppose k is not a multiple of q − 1. Then there exists an element b in K*
with $b^k \neq 1$. Since multiplying on the left by b simply permutes the elements of K,
we see that

\[
  S(k, q) = \sum_{a \in K} a^{k} = \sum_{a \in K} (ba)^{k} = b^{k} S(k, q).
\]

So $(b^k - 1)S(k, q) = 0$. Since the first factor is not zero, we have S(k, q) = 0 in
this case.

We summarize this in the following


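Lemma 11.4.4
\[
  S(k, q) =
  \begin{cases}
    0  & \text{if } k = 0, \\
    -1 & \text{if } k > 0 \text{ and } q - 1 \text{ divides } k, \\
    0  & \text{otherwise.}
  \end{cases}
\]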
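
As a quick numerical sanity check of the lemma, one can take q = p prime, so that K = Z/pZ and ordinary modular arithmetic suffices. This is only a sketch for the prime-field case; `power_sum` and `predicted` are hypothetical helper names, and the value −1 is represented by the residue p − 1.

```python
def power_sum(k, p):
    """S(k, p): the sum of a^k over all a in Z/pZ, reduced mod p (with 0^0 = 1)."""
    return sum(pow(a, k, p) for a in range(p)) % p

def predicted(k, p):
    """Value given by Lemma 11.4.4 for q = p, as a residue in 0..p-1."""
    if k > 0 and k % (p - 1) == 0:
        return p - 1          # this is -1 in Z/pZ
    return 0

# Compare the direct sum with the lemma for a few primes and exponents.
agreement = all(
    power_sum(k, p) == predicted(k, p)
    for p in (2, 3, 5, 7, 11)
    for k in range(25)
)
```

Python's three-argument `pow` returns 1 for `pow(0, 0, p)` when p > 1, which matches the convention a⁰ = 1 used for the case k = 0.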


Now fix a positive integer n. Let $V = K^{(n)}$, the vector space of n-tuples with
entries in K. For any vector $v = (a_1, \ldots, a_n) \in V$ and polynomial

\[
  p = p(x_1, \ldots, x_n) \in K[x_1, \ldots, x_n],
\]

the ring of polynomials in the indeterminates $x_1, \ldots, x_n$, we let the symbol p(v)
denote the result of substituting $a_i$ for $x_i$ in the polynomial, that is,

\[
  p(v) := p(a_1, \ldots, a_n).
\]

If p(v) = 0 for a vector v ∈ V, we say that v is a zero of the polynomial p.


Now suppose $p = x_1^{k_1} \cdots x_n^{k_n}$ is a monomial of degree $\sum_i k_i < n(q - 1)$. Then

\[
  \sum_{v \in V} p(v) = \prod_{i=1}^{n} S(k_i, q) = 0,
\]

since at least one of the exponents $k_i$ is less than q − 1 and so by Lemma 11.4.4
introduces a factor of zero in the product. Since every polynomial p in $K[x_1, \ldots, x_n]$
of degree less than n(q − 1) is a K-linear combination of such monomials, we have


Lemma 11.4.5 If $p \in K[x_1, \ldots, x_n]$ has degree less than n(q − 1), then

\[
  S(p) := \sum_{v \in V} p(v) = 0 .
\]

Now we can prove the following:


Theorem 11.4.6 (Chevalley-Warning) Let K be a finite field of characteristic p with
exactly q elements. Suppose $p_1, \ldots, p_m$ is a family of polynomials of $K[x_1, \ldots, x_n]$,
the sum of whose degrees is less than n, the number of variables. Let X be the
collection

\[
  \{ v \in K^{(n)} \mid p_i(v) = 0,\ i = 1, \ldots, m \}
\]

of common zeros of these polynomials. Then

\[
  |X| \equiv 0 \bmod p .
\]


Proof Let $R := \prod_{i=1}^{m} (1 - p_i^{\,q-1})$. Then R is a polynomial whose degree is at most
$(q - 1)\sum_i \deg p_i$, which is less than n(q − 1). Now if v ∈ X, then v is a common zero of the
polynomials $p_i$, so R(v) = 1. But if v ∉ X, then for some i, $p_i(v) \neq 0$, so
$p_i(v)^{q-1} = 1$, introducing a zero factor in the definition of R(v). Thus we see that
the polynomial R induces the characteristic function of X; that is, it has value 1 on
elements of X and value 0 outside of X. It follows that

\[
  S(R) := \sum_{v \in V} R(v) \equiv |X| \bmod p. \tag{11.5}
\]

But since deg R < n(q − 1), Lemma 11.4.5 forces S(R) = 0, which converts
(11.5) into the conclusion.
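
The congruence can be observed directly by counting zeros over a prime field. The sketch below is restricted to K = Z/pZ and uses two sample systems chosen for this illustration (they do not come from the text): one quadratic form in three variables, and a pair of polynomials of degrees 1 and 2 in four variables.

```python
from itertools import product

def count_common_zeros(polys, n, p):
    """Count the vectors v in (Z/pZ)^n at which every polynomial in polys vanishes."""
    return sum(
        1
        for v in product(range(p), repeat=n)
        if all(f(*v) % p == 0 for f in polys)
    )

# One polynomial of degree 2 in n = 3 variables (2 < 3).
quadric = [lambda x, y, z: x * x + y * y + z * z]

# Two polynomials of degrees 1 and 2 in n = 4 variables (1 + 2 < 4).
pair = [
    lambda a, b, c, d: a + b + c + d,
    lambda a, b, c, d: a * b + c * d,
]

# Number of common zeros for several primes p.
counts = {
    p: (count_common_zeros(quadric, 3, p), count_common_zeros(pair, 4, p))
    for p in (2, 3, 5, 7)
}
```

For p = 3 the quadric has 9 zeros: the zero vector together with the 8 vectors all of whose coordinates are non-zero. In every case computed here the count is divisible by p, as Theorem 11.4.6 requires.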


Corollary 11.4.7 Suppose $p_1, \ldots, p_m$ is a collection of polynomials over K =
GF(q) in the indeterminates $x_1, \ldots, x_n$, each with a zero constant term, the sum of
whose degrees is less than n. Then there
exists a common non-trivial zero for these polynomials. (Non-trivial means one that
is not the zero vector of V.)

In particular, if p is a homogeneous polynomial over K in more indeterminates
than its degree, then it must have a non-trivial zero.


11.5 Separable Elements and Field Extensions


11.5.1 Separability 


A polynomial p(x) ∈ F[x] is said to be separable if and only if its irreducible factors
in F[x] each have no repeated zeros in any splitting field.

Our discussion on separability of polynomials will be greatly facilitated by the
following concept. Let F be a field and define the formal derivative

\[
  \partial : F[x] \to F[x], \qquad f(x) \mapsto f'(x)
\]

by setting

\[
  f'(x) = \sum_{k=0}^{n} k a_k x^{k-1}, \quad \text{whenever } f(x) = \sum_{k=0}^{n} a_k x^{k}.
\]

The formal derivative satisfies the familiar “product rule”: $\partial(f(x)g(x)) =
(\partial f(x))g(x) + f(x)\,\partial g(x)$, and hence, its generalization, the “Leibniz rule”:

\[
  \partial\bigl(f_1(x) f_2(x) \cdots f_r(x)\bigr)
  = \sum_{i=1}^{r} f_1(x) \cdots f_{i-1}(x)\,\bigl(\partial f_i(x)\bigr)\, f_{i+1}(x) \cdots f_r(x).
\]
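
The definition is purely algebraic, so it can be computed coefficient by coefficient. Here is a minimal sketch over Z/5Z, assuming a polynomial is stored as a list of coefficients with the constant term first; the representation and the helper names `trim`, `derivative`, `add`, and `mul` are assumptions of this example.

```python
P = 5  # work in (Z/5Z)[x]; a polynomial is a list of coefficients, lowest degree first

def trim(f):
    """Reduce coefficients mod P and drop trailing zeros (the zero polynomial is [])."""
    f = [c % P for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f):
    """Formal derivative: the coefficient of x^(k-1) is k * a_k."""
    return trim([k * c for k, c in enumerate(f)][1:])

def add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])

def mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [4, 0, 1]      # 4 + x^2

# Product rule: (fg)' = f'g + fg'
lhs = derivative(mul(f, g))
rhs = add(mul(derivative(f), g), mul(f, derivative(g)))

# In characteristic p the derivative of x^p vanishes identically.
x_to_the_p = [0] * P + [1]
```

The last line illustrates the feature of positive characteristic that matters later in this section: a non-constant polynomial such as x^p can have zero derivative.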


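Furthermore, the formal derivative is “independent of its domain” inasmuch as if
F ⊆ E is an extension of fields, then the following diagram commutes:

\[
  \begin{array}{ccc}
    F[x] & \xrightarrow{\ \partial\ } & F[x] \\
    \downarrow & & \downarrow \\
    E[x] & \xrightarrow{\ \partial\ } & E[x]
  \end{array}
\]

where the vertical arrows are obvious inclusions.

The following simple result will be useful in the sequel.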


Lemma 11.5.1 Let f(x), g(x) ∈ F[x], and let F ⊆ E be a field extension. Then
f(x) and g(x) are relatively prime in F[x] if and only if they are relatively prime in
E[x].


Proof If f(x), g(x) are relatively prime in F[x], then there exist polynomials
s(x), t(x) ∈ F[x] with s(x)f(x) + t(x)g(x) = 1. Since this equation is obviously
valid in E[x] we infer that f(x), g(x) are relatively prime in E[x], as well.

If f(x) and g(x) are not relatively prime in F[x], their greatest common
divisor in F[x] has positive degree and divides both f(x) and g(x) in E[x] as well. Thus not being
relatively prime in F[x] implies not being relatively prime in E[x].


There is a simple way to tell whether a polynomial f(x) in F[x] is separable.

Lemma 11.5.2 The polynomial f(x) has no repeated zeros in its splitting field if
and only if f(x) and f′(x) are relatively prime.


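Proof Let E ⊇ F be a splitting field for f(x) over F, so that f(x) splits into
linear factors in E[x]. Multiplying f(x) by a non-zero constant affects neither its zeros
nor its common factors with f′(x), so we may assume that f(x) is monic.

First we assume that f(x) has no repeated zeros, so that

\[
  f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_r) \in E[x],
\]

where $\alpha_1, \alpha_2, \ldots, \alpha_r \in E$ and $\alpha_i \neq \alpha_j$ whenever
i ≠ j. Using the above-mentioned Leibniz rule, we have

\[
  f'(x) = \sum_{i=1}^{r} (x - \alpha_1) \cdots (x - \alpha_{i-1}) \cdot 1 \cdot (x - \alpha_{i+1}) \cdots (x - \alpha_r)
        = \sum_{i=1}^{r} (x - \alpha_1) \cdots \widehat{(x - \alpha_i)} \cdots (x - \alpha_r),
\]

where the notation $\widehat{(x - \alpha_i)}$ simply means that the indicated factor has been removed
from the product. From the above, it is obvious that $f'(\alpha_i) = \prod_{j \neq i} (\alpha_i - \alpha_j) \neq 0$, i = 1, 2, ..., r;
that is to say, f(x) and f′(x) share no common factors in E[x]. From Lemma 11.5.1,
it follows that f(x) and f′(x) are relatively prime in F[x], as required.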
Conversely, assume that f(x) and f′(x) are relatively prime in F[x]. Write

\[
  f(x) = (x - \alpha_1)^{e_1} (x - \alpha_2)^{e_2} \cdots (x - \alpha_r)^{e_r},
\]

for distinct zeros $\alpha_i \in E$ and positive integral exponents $e_1, e_2, \ldots, e_r$. Again applying the Leibniz rule, we
obtain

\[
  f'(x) = \sum_{i=1}^{r} e_i (x - \alpha_1)^{e_1} \cdots (x - \alpha_i)^{e_i - 1} \cdots (x - \alpha_r)^{e_r}.
\]

If some exponent $e_i$ is greater than 1, then the above shows clearly that $f'(\alpha_i) = 0$,
i.e., f(x) and f′(x) share a common zero, and hence cannot be relatively prime in
E[x]. In view of Lemma 11.5.1, this is a contradiction.
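
Lemma 11.5.2 turns the question of repeated zeros into a gcd computation that never leaves F[x]. The following sketch carries this out over Z/5Z with the Euclidean algorithm; the coefficient-list representation (constant term first) and the names `poly_mod`, `poly_gcd`, and `has_repeated_zero` are assumptions of the example, and `pow(c, -1, P)` for the modular inverse requires Python 3.8 or later.

```python
P = 5  # coefficients live in Z/5Z; polynomials are lists, lowest degree first

def trim(f):
    """Reduce coefficients mod P and drop trailing zeros (the zero polynomial is [])."""
    f = [c % P for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f):
    return trim([k * c for k, c in enumerate(f)][1:])

def poly_mod(f, g):
    """Remainder of f on division by the non-zero polynomial g."""
    f, g = trim(f), trim(g)
    inv_lead = pow(g[-1], -1, P)          # inverse of the leading coefficient of g
    while len(f) >= len(g):
        factor = f[-1] * inv_lead % P
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] = (f[shift + i] - factor * c) % P
        f = trim(f)
    return f

def poly_gcd(f, g):
    """Monic greatest common divisor of f and g (f is assumed non-zero)."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, poly_mod(f, g)
    inv = pow(f[-1], -1, P)
    return [c * inv % P for c in f]

def has_repeated_zero(f):
    """Lemma 11.5.2: f has a repeated zero iff gcd(f, f') has positive degree."""
    return len(poly_gcd(f, derivative(f))) > 1

f1 = [3, 0, 1, 1]          # x^3 + x^2 + 3 = (x - 1)^2 (x - 2) over Z/5Z
f2 = [1, 0, 1]             # x^2 + 1 = (x - 2)(x - 3) over Z/5Z
f3 = [3, 0, 0, 0, 0, 1]    # x^5 + 3 = (x - 2)^5 over Z/5Z, with zero derivative
```

For `f1` the gcd with its derivative is x − 1 (stored as `[4, 1]`), exposing the double zero; for `f2` the gcd is 1; and for `f3` the derivative vanishes, so the gcd is `f3` itself.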


The preceding lemma has a very nice application when f (x) is irreducible. 


Lemma 11.5.3 If f(x) ∈ F[x] is irreducible, then f(x) has no repeated zeros in
its splitting field if and only if f′(x) ≠ 0.


As an immediate corollary to the above, we see that if F is a field of characteristic
0, then F[x] contains no irreducible inseparable polynomials. On the other hand if
F has positive characteristic p, then by Lemma 11.5.3, an irreducible polynomial
f(x) ∈ F[x] has a repeated root only when f′(x) = 0, forcing $f(x) = g(x^p)$
for some polynomial g(x) ∈ F[x]. In fact, a moment’s thought reveals that
if f(x) ∈ F[x] is irreducible and inseparable, then we may write $f(x) = g(x^{p^e})$,
where e is a positive integral exponent and where g(x) ∈ F[x] is an irreducible
separable polynomial.

Let F ⊆ E be an extension and let a ∈ E. We say that a is separable over F if
it is algebraic over F and if $\mathrm{Irr}_F(a)$ is a separable polynomial. The extension E of
F is said to be separable if and only if every element of E which is algebraic over
F is separable. We may already infer the following:


Lemma 11.5.4 The algebraic extension F ⊆ E is separable whenever F has
characteristic 0 or is a finite field.


Proof If F has characteristic 0, the result is obvious by the above remarks. If F is
a finite field of characteristic p, the p-power mapping σ : F → F is an automorphism, and hence is
surjective. Now let f(x) ∈ F[x] be irreducible and assume that f(x) is inseparable.
Write $f(x) = g(x^p)$, where $g(x) = \sum_{i=0}^{m} a_i x^i$. For each i = 0, 1, ..., m, let $b_i \in F$
satisfy $b_i^p = a_i$; thus,

\[
  f(x) = g(x^p) = \sum_{i=0}^{m} b_i^{\,p} x^{ip} = \Bigl( \sum_{i=0}^{m} b_i x^{i} \Bigr)^{p},
\]

contrary to the irreducibility of f(x). Thus, any algebraic extension of a finite field is
also separable.
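
To see the identity in the proof at work, here is a small worked instance over F = GF(3), chosen for illustration only. Since $a^3 = a$ for every a ∈ GF(3), we may take $b_i = a_i$.

```latex
\[
  g(x) = x^{2} + 2x + 1, \qquad
  f(x) = g(x^{3}) = x^{6} + 2x^{3} + 1 ,
\]
\[
  \bigl(x^{2} + 2x + 1\bigr)^{3}
  = x^{6} + 2^{3}x^{3} + 1
  = x^{6} + 2x^{3} + 1
  = f(x) \quad \text{in } \mathrm{GF}(3)[x].
\]
```

So over a finite field every polynomial of the form $g(x^p)$ is a pth power and in particular is reducible, which is exactly why an irreducible polynomial over a finite field cannot be inseparable.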


We see, therefore, that if F ⊆ E is an inseparable algebraic extension, then F
must be an infinite field of positive characteristic p. We shall take up this situation
in the section to follow. Before doing this, it will be helpful to consider two rather
typical examples.


Example 52 Let P = GF(2) be the field of two elements, let x be an indeterminate
over P, and set F = P(x), the field of “rational functions” over P. Obviously, F is
an infinite field of characteristic 2. Note also that F is the field of fractions of the PID
P[x]. Now set $f(y) = y^2 + x$ ∈ F[y]; note that since x is prime in P[x], we may
apply Eisenstein’s criterion (see Exercise (8) in Sect. 9.13.1, p. 318) to infer that f(y)
is irreducible in F[y]. Thus, if α is a zero of f(y) in a splitting field over F for f(y),
then [F(α) : F] = 2. Furthermore, since f′(y) = 2y = 0 ∈ F[y], we see that α
is inseparable over F. Therefore F(α) ⊇ F is an inseparable extension (that is to say,
not a separable extension). In fact, more is true: every element of F(α)\F is
inseparable over F, as follows. Since {1, α} is an F-basis of F(α), we see that every
element of F(α)\F can be written in the form β = a + bα, a, b ∈ F, b ≠ 0. We set

\[
  g(y) = (y - (a + b\alpha))^{2} = y^{2} + (a + b\alpha)^{2} = y^{2} + a^{2} + b^{2}x \in F[y];
\]

since a + bα is a zero of g(y), and since a + bα ∉ F, we see that $g(y) = \mathrm{Irr}_F(a + b\alpha)$.
But as g′(y) = 0 we see that β is an inseparable element over F. This proves that
every element of F(α)\F is inseparable over F; we shall come to call such extensions
purely inseparable.


Example 53 As a hybrid example, we take P = GF(3), the 3-element field, and
define F = P(x). Set $f(y) = y^6 + x^2y^3 + x$ ∈ F[y]. Again, an application of
Eisenstein reveals that f(y) is irreducible in F[y]. Since f′(y) = 0 we infer that
f(y) is inseparable over F. We take α to be a zero (in a splitting field) of f(y),
from which we infer that [F(α) : F] = 6. Furthermore, we know that this is not
a separable extension since α is not separable over F. We note that $f(y) = g(y^3)$,
where g(y) is the irreducible separable polynomial $g(y) = y^2 + x^2y + x$ ∈ F[y].
This says that $\alpha^3$, being a root of g(y), is separable over F (note that $\alpha^3 \notin F$).
Therefore, we see that F(α)\F contains both separable and inseparable elements
over F. In fact, we have a tower $F \subseteq F(\alpha^3) \subseteq F(\alpha)$, where $[F(\alpha^3) : F] = 2$
and so $[F(\alpha) : F(\alpha^3)] = 3$. Our work in the next section will show that, in fact,
$F(\alpha^3)$ contains all of the separable elements of F(α) over F, and that the extension
$F(\alpha^3) \subseteq F(\alpha)$ is purely inseparable.


11.5.2 Separable and Inseparable Extensions 


The primary objective of this subsection is to show that a finite-degree extension
F ⊆ E can be factored as $F \subseteq E_{\mathrm{sep}} \subseteq E$, where $E_{\mathrm{sep}}$ consists precisely of those
elements of E separable over F, and where $E_{\mathrm{sep}} \subseteq E$ is a purely inseparable
extension (i.e., no elements of $E \setminus E_{\mathrm{sep}}$ are separable).

Suppose that F is a field of positive characteristic p, and that K ⊇ F is an
extension of finite degree. We have the pth power mapping K → K, $a \mapsto a^p$, a ∈ K.
Note that if F is not finite, K is not finite, and we cannot infer that this mapping
is an automorphism of K; we can only infer that it gives an embedding of K into
itself (as explained in Lemma 7.4.6 of Sect. 7.4.2, p. 221). We denote the image by
$K^p$, and denote by $FK^p$ the subfield of K generated by the subfields F and $K^p$.

We shall have need of the following technical result: 


Theorem 11.5.5 Let F be a field of positive characteristic p, and let K ⊇ F be an
extension of finite degree. If $K = FK^p$, then the pth power mapping on K preserves
F-linear independence of subsets.


Proof Since any F-linearly independent subset of K can be completed to an F-basis
for K, it suffices to prove that the pth power mapping preserves the linear
independence of any F-basis for K. Suppose, then, that $\{a_1, \ldots, a_n\}$ is an F-basis of
K, and that c is an element of K. Then c can be written as an F-linear combination
of the basis elements:

\[
  c = \alpha_1 a_1 + \cdots + \alpha_n a_n, \quad \text{all } \alpha_i \in F,
\]

from which we conclude that

\[
  c^{p} = \alpha_1^{p} a_1^{p} + \cdots + \alpha_n^{p} a_n^{p}.
\]

Therefore,

\[
  K^{p} = F^{p} a_1^{p} + \cdots + F^{p} a_n^{p}
\]

and so, by hypothesis,

\[
  K = FK^{p} = F a_1^{p} + \cdots + F a_n^{p}.
\]

Thus the pth powers of the basis elements $a_i$ form an F-spanning set of K of size n.
Since $n = \dim_F K$, these n spanning elements must be F-linearly independent.


Theorem 11.5.6 (Separability criterion for fields of prime characteristic) Let K be
any algebraic field extension of the field F, where F (and hence K) has prime
characteristic p.


(i) If K is a separable extension of F, then $K = FK^p$. (Note that the degree [K : F]
need not be finite here.)
(ii) If K is a finite extension of F such that $K = FK^p$, then K is separable over F.


Proof Assume, as in Part (i), that K ⊇ F is a separable extension. Then it is clear
that every element of K is also separable over the intermediate subfield $L := FK^p$. If
a ∈ K, then $b = a^p \in L$ and so a is a zero of the polynomial $x^p - b \in L[x]$. Thus,
if $p(x) = \mathrm{Irr}_L(a) \in L[x]$, we have that p(x) divides $x^p - b$ in L[x]. However, in
K[x], $x^p - b = x^p - a^p = (x - a)^p$ and so p(x) cannot be separable unless it has
degree 1. This forces p(x) = x − a ∈ L[x], i.e., that a ∈ L, proving that $K = FK^p$.

For Part (ii), assume that [K : F] is finite, and that $K = FK^p$. Suppose, by way
of contradiction, that a is an element of K which is not separable over F. Then if
$f(x) := \mathrm{Irr}_F(a)$, we have that $f(x) = g(x^p)$, where g(x) ∈ F[x] is irreducible.
Write $g(x) = \sum_{j=0}^{m} a_j x^j$ and conclude that

\[
  0 = f(a) = g(a^{p}) = a_0 + a_1 a^{p} + \cdots + a_m a^{mp},
\]

which says that $\{1, a^p, a^{2p}, \ldots, a^{mp}\}$ is an F-linearly dependent subset of K. On the
other hand, if $\{1, a, a^2, \ldots, a^m\}$ were F-linearly dependent, then there would exist a
non-zero polynomial p(x) in F[x] of degree at most m having the element a for a zero, contrary to
the fact that $f(x) = \mathrm{Irr}_F(a)$ has degree pm > m. Therefore,
the pth power mapping has taken the F-linearly independent set $\{1, a, a^2, \ldots, a^m\}$
to the F-linearly dependent set $\{1, a^p, a^{2p}, \ldots, a^{mp}\}$, a violation of Theorem 11.5.5.
Therefore, a ∈ K must have been separable over F in the first place.


Corollary 11.5.7 (Separability of simple extensions) Fix a field F of prime
characteristic p, let F ⊆ K be a field extension, and let a ∈ K be algebraic over F. The
following are equivalent:

(i) F(a) ⊇ F is a separable extension;
(ii) a is separable over F;
(iii) $F(a) = F(a^p)$.


Proof (i)⇒(ii) is, of course, obvious.

Assume (ii). Since a is a zero of the polynomial $x^p - a^p \in F(a^p)[x]$, we see
that $\mathrm{Irr}_{F(a^p)}(a)$ divides $x^p - a^p$. But $x^p - a^p = (x - a)^p \in F(a)[x]$. Since a is
separable over F, it is separable over $F(a^p)$ and so it follows that $a \in F(a^p)$, forcing
$F(a) = F(a^p)$, which proves (ii)⇒(iii).

Finally, if $F(a) = F(a^p)$, then $F \cdot F(a)^p = F \cdot F^p(a^p) = F(a^p) = F(a)$, by
hypothesis. Apply Theorem 11.5.6 part (ii) to infer that F(a) is separable over F,
which proves that (iii)⇒(i).
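
Condition (iii) is easy to test in the setting of Example 52. The display below merely restates that example (F = GF(2)(x), p = 2, and α a zero of y² + x, so that α² = x) in the notation of the corollary.

```latex
\[
  F(\alpha^{2}) = F(x) = F \subsetneq F(\alpha),
  \qquad [F(\alpha) : F(\alpha^{2})] = 2 .
\]
```

Thus F(α) ≠ F(α^p), condition (iii) fails, and the corollary recovers the conclusion of Example 52 that α is inseparable over F.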


Corollary 11.5.8 (Transitivity of separability among finite extensions) Suppose K 
is a finite separable extension of L and that L is a finite separable extension of F. 
Then K is a finite separable extension of F. 


Proof That [K : F] is finite is known by Lemma 11.2.2, so we only need to prove the
separability of K over F. We may assume that F has prime characteristic p, otherwise
K is separable by Lemma 11.5.4. By Part (ii) of Theorem 11.5.6, it suffices to prove
that $K = FK^p$. But, since K is separable over L, and since L is separable over F,
Part (i) of Theorem 11.5.6 gives

\[
  K = LK^{p} = (FL^{p})K^{p} = F(LK)^{p} = FK^{p}.
\]

That K is separable over F now follows from Theorem 11.5.6, part (ii).


Corollary 11.5.9 Let K be an arbitrary extension of F. Then

\[
  K_{\mathrm{sep}} = \{ a \in K \mid a \text{ is separable over } F \}
\]

is a subfield of K containing F.


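Proof It clearly suffices to show that $K_{\mathrm{sep}}$ is a subfield of K. However, if $a, b \in K_{\mathrm{sep}}$,
then by Corollary 11.5.7 we have that F(a) is separable over F and that F(a, b) is
separable over F(a). By Corollary 11.5.8, F(a, b) is separable over F, and so
a ± b, ab and (for b ≠ 0) a/b all lie in $K_{\mathrm{sep}}$.


The field $K_{\mathrm{sep}}$ is called the separable closure of F in K.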

Finally, recall that we have called an algebraic extension F ⊆ K purely
inseparable if every element of K \ F is inseparable over F. Therefore, we see already that
if F ⊆ K is an algebraic extension, then $K_{\mathrm{sep}}$ is a separable extension of F, and, by
Corollary 11.5.8, K is a purely inseparable extension of $K_{\mathrm{sep}}$.

We conclude this section with a characterization of purely inseparable extensions. 


Theorem 11.5.10 Let F ⊆ K be an algebraic extension. The following propositions
are equivalent:

(1) K is a purely inseparable extension of F;

(2) For all a ∈ K\F, $a^{p^e} \in F$ for some positive exponent e, where p is the positive
characteristic;

(3) For all a ∈ K\F, $\mathrm{Irr}_F(a) = x^{p^e} - b$ for some b ∈ F and some positive
exponent e.


From Theorem 11.5.10 we extract the following useful corollary for
simple extensions.


Corollary 11.5.11 Let F be a field of characteristic p, contained in some field K.
Assume that a ∈ K satisfies $a^{p^e} \in F$ for some positive exponent e. Then the subfield
F(a) ⊆ K is a purely inseparable extension of F.


Proof If [F(a) : F] = r, then any element b ∈ F(a) can be expressed as a polynomial
in a: $b = \sum_{j=0}^{r-1} c_j a^j$, where the coefficients $c_0, c_1, \ldots, c_{r-1} \in F$. But then

\[
  b^{p^{e}} = \Bigl( \sum_{j=0}^{r-1} c_j a^{j} \Bigr)^{p^{e}}
            = \sum_{j=0}^{r-1} c_j^{\,p^{e}} \bigl(a^{p^{e}}\bigr)^{j} \in F.
\]

Now apply Theorem 11.5.10.


11.6 Galois Theory


11.6.1 Galois Field Extensions


Let K be any field, and let k be a subfield of K. The k-isomorphisms of K with itself
are called k-automorphisms of K. Under composition they form a group which we
denote by Gal(K/k). The group of all automorphisms of K is denoted Aut(K), as
usual. It is obvious that Gal(K/k) is a subgroup of Aut(K).

Suppose now that G is any group of automorphisms of the field K. The elements
fixed by every automorphism of G form a subfield called the fixed subfield of G
(and sometimes the field of invariants of G), accordingly denoted $\mathrm{inv}_G(K)$. If G ≤
Gal(K/k) then clearly k is a subfield of $\mathrm{inv}_G(K)$.

Now set G = Gal(K/k), let $\Omega_G$ be the poset of all subgroups of G, and let $\Omega_{K/k}$ be the poset of
all subfields of K which contain k; in both cases we take the partial order to be
containment. To each subfield L with k ≤ L ≤ K, there corresponds a subgroup
Gal(K/L). Similarly, for each subgroup H of Gal(K/k), there corresponds the
subfield $\mathrm{inv}_H(K)$ containing k. These correspondences are realized as two mappings:

\[
  \mathrm{Gal}(K/\bullet) : \Omega_{K/k} \to \Omega_G, \qquad
  \mathrm{inv}_{\bullet}(K) : \Omega_G \to \Omega_{K/k},
\]

which are obviously inclusion reversing. Furthermore, for all $E \in \Omega_{K/k}$ and for all
$H \in \Omega_G$, one has that

\[
  \begin{aligned}
    \mathrm{inv}_{\bullet}(K) \circ \mathrm{Gal}(K/\bullet)(E) &= \mathrm{inv}_{\mathrm{Gal}(K/E)}(K) \supseteq E, \text{ and} \\
    \mathrm{Gal}(K/\bullet) \circ \mathrm{inv}_{\bullet}(K)(H) &= \mathrm{Gal}(K/\mathrm{inv}_H(K)) \supseteq H.
  \end{aligned}
\]

In plain language, the composition of the two poset morphisms, in either order, is
monotone on its defined poset. Therefore, we see that the quadruple
$(\Omega_{K/k}, \Omega_G, \mathrm{Gal}(K/\bullet), \mathrm{inv}_{\bullet}(K))$ is a Galois connection in the sense of Sect. 2.2.15.
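
The inclusion-reversing behaviour can be seen concretely for K = GF(16) over k = GF(2), where by Corollary 11.4.3 the group G = Gal(K/k) is cyclic of order 4, generated by the squaring map σ. The sketch below assumes GF(16) is modelled as 4-bit integers modulo x⁴ + x + 1; the encoding and the names `mul`, `frobenius_power`, and `fixed_field` are choices made for this illustration.

```python
MODULUS = 0b10011  # x^4 + x + 1, irreducible over GF(2)

def mul(a, b):
    """Multiply two elements of GF(16), encoded as 4-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:     # degree reached 4: reduce modulo x^4 + x + 1
            a ^= MODULUS
    return result

def frobenius_power(a, d):
    """Apply sigma^d, where sigma is the squaring map a -> a^2."""
    for _ in range(d):
        a = mul(a, a)
    return a

# The subgroups of G = <sigma> are <sigma^d> for d = 1, 2, 4 (orders 4, 2, 1).
# An element is fixed by <sigma^d> exactly when it is fixed by the generator sigma^d.
fixed_field = {
    d: {a for a in range(16) if frobenius_power(a, d) == a}
    for d in (1, 2, 4)
}

sizes = {d: len(elems) for d, elems in fixed_field.items()}
```

The largest subgroup ⟨σ⟩ has the smallest fixed field (2 elements), ⟨σ²⟩ fixes a copy of GF(4), and the trivial subgroup fixes all of GF(16), so larger subgroups correspond to smaller subfields.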

Next, if we assume, as we typically shall, that [K : k] < ∞, then every interval
in $\Omega_{K/k}$ is algebraic in the sense of Sect. 2.3.2.⁶ In fact, the mapping which takes
the algebraic interval $[E_1, E_2]$ of $\Omega_{K/k}$ to the index $[E_2 : E_1]$ is a Z-valued interval
measure in the sense of Sect. 2.5.2. We shall show presently that when [K : k] < ∞,
|Gal(K/k)| < ∞, and so similar comments apply to the poset $\Omega_G$, where, if $H_2 \leq H_1$
are subgroups of G = Gal(K/k), then the mapping which takes the interval $[H_2, H_1]$
to the index $[H_1 : H_2]$ is the appropriate Z-valued interval measure.


11.6.2 The Dedekind Independence Lemma 


We begin with the following observation. Let K be a field; let S be a set; and let $K^S$
be the set of mappings S → K. We may give $K^S$ a K-vector space structure by
point-wise operations. Thus, if $f_1, f_2 \in K^S$, and if α ∈ K, then we set
$\alpha \cdot (f_1 + f_2)(s) = \alpha f_1(s) + \alpha f_2(s)$ for all s ∈ S.


In terms of the above, we state the following important result.

Lemma 11.6.1 (Dedekind Independence Lemma)


1. Let E, K be fields, and let $\sigma_1, \sigma_2, \ldots, \sigma_r$ be distinct monomorphisms E → K.
Then $\sigma_1, \sigma_2, \ldots, \sigma_r$ are K-linearly independent in $K^E$, the K-vector space of
all functions from E to K.

2. Let E be a field, and let G be a group of automorphisms of E. We may regard each
α ∈ E as an element of $E^G$ via σ ↦ σ(α) ∈ E, σ ∈ G. Now set $K = \mathrm{inv}_G(E)$
and assume that we are given K-linearly independent elements $\alpha_1, \alpha_2, \ldots, \alpha_r \in E$.
Then $\alpha_1, \alpha_2, \ldots, \alpha_r$ are E-linearly independent elements of $E^G$.


Proof For Part 1, suppose, by way of contradiction, that there exists a nontrivial linear
dependence relation of the form $a_1\sigma_1 + \cdots + a_r\sigma_r = 0 \in K^E$, $a_1, \ldots, a_r \in K$.
Among all such relations we may assume that we have chosen one in which the
number of summands r is as small as possible.

⁶ Recall that a poset is algebraic if its “zero” and “one” are connected by a finite unrefinable chain.

We have, for all α ∈ E, that

\[
  a_1\sigma_1(\alpha) + a_2\sigma_2(\alpha) + \cdots + a_r\sigma_r(\alpha) = 0, \tag{11.6}
\]

where each coefficient $a_i$ is non-zero by the minimality of r. Since $\sigma_1 \neq \sigma_2$, we may
choose α′ ∈ E such that $\sigma_1(\alpha') \neq \sigma_2(\alpha')$. If we replace the argument α in Eq. (11.6)
by α′α and use the fact that each $\sigma_i$, i = 1, 2, ..., r, is a homomorphism, we obtain
the following:

\[
  a_1\sigma_1(\alpha')\sigma_1(\alpha) + a_2\sigma_2(\alpha')\sigma_2(\alpha) + \cdots + a_r\sigma_r(\alpha')\sigma_r(\alpha) = 0. \tag{11.7}
\]

Next, multiply both sides of Eq. (11.6) by the scalar $\sigma_1(\alpha')$:

\[
  a_1\sigma_1(\alpha')\sigma_1(\alpha) + a_2\sigma_1(\alpha')\sigma_2(\alpha) + \cdots + a_r\sigma_1(\alpha')\sigma_r(\alpha) = 0. \tag{11.8}
\]

Subtracting Eq. (11.8) from Eq. (11.7) yields

\[
  a_2(\sigma_2(\alpha') - \sigma_1(\alpha'))\sigma_2(\alpha) + \cdots + a_r(\sigma_r(\alpha') - \sigma_1(\alpha'))\sigma_r(\alpha) = 0.
\]

Since $a_2(\sigma_2(\alpha') - \sigma_1(\alpha')) \neq 0$, and α was arbitrary, we have produced a non-trivial
dependence relation among the $\sigma_i$, i ≠ 1, against the minimal choice of r. Thus no
such dependence relation among the maps $\{\sigma_i\}$ exists and Part 1 is proved.
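
Part 1 can be confirmed exhaustively in a tiny case. The sketch below takes E = K = GF(8), modelled as 3-bit integers modulo x³ + x + 1 (an encoding assumed for this example), and checks that no non-trivial K-linear combination of the three automorphisms 1, σ, σ² vanishes identically, where σ is the squaring map; `vanishes` is a hypothetical helper name.

```python
from itertools import product

MODULUS = 0b1011  # x^3 + x + 1, irreducible over GF(2)

def mul(a, b):
    """Multiply two elements of GF(8), encoded as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MODULUS
    return result

ELEMENTS = range(8)

# The three distinct automorphisms of GF(8), stored as tables of values.
identity = tuple(ELEMENTS)
sigma = tuple(mul(a, a) for a in ELEMENTS)
sigma2 = tuple(sigma[sigma[a]] for a in ELEMENTS)
maps = (identity, sigma, sigma2)

def vanishes(coeffs):
    """True if c0*1 + c1*sigma + c2*sigma^2 is the zero function on GF(8)."""
    return all(
        mul(coeffs[0], maps[0][a]) ^ mul(coeffs[1], maps[1][a]) ^ mul(coeffs[2], maps[2][a]) == 0
        for a in ELEMENTS
    )

# Search all 8^3 - 1 non-zero coefficient triples for a dependence relation.
dependencies = [c for c in product(ELEMENTS, repeat=3) if any(c) and vanishes(c)]
```

The list `dependencies` comes out empty, as the lemma predicts for distinct monomorphisms.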

For part 2, we again argue by considering an E-linear dependence relation among
the functions $\alpha_i : G \to E$ with a minimal number of terms r. Thus one obtains a
relation

\[
  a_1\sigma(\alpha_1) + a_2\sigma(\alpha_2) + \cdots + a_r\sigma(\alpha_r) = 0, \tag{11.9}
\]

valid for all σ ∈ G, and where we may assume that the elements $a_1, a_2, \ldots, a_r \in E$
are all nonzero. We may as well assume that $a_1 = 1$. Then setting σ = 1 ∈ G in
Eq. (11.9) yields

\[
  \alpha_1 + a_2\alpha_2 + \cdots + a_r\alpha_r = 0.
\]

Since $\alpha_1, \alpha_2, \ldots, \alpha_r$ are linearly independent over K, the preceding equation implies
that at least one of the elements $a_2, \ldots, a_r$ is not in K. Re-indexing the $a_i$ if necessary
we may assume that $a_2 \notin K$. Therefore, there exists an element σ′ ∈ G such that
$\sigma'(a_2) \neq a_2$. Equation (11.9) with σ replaced by σ′σ then reads as:

\[
  \sigma'\sigma(\alpha_1) + a_2\sigma'\sigma(\alpha_2) + \cdots + a_r\sigma'\sigma(\alpha_r) = 0, \tag{11.10}
\]

still valid for all σ ∈ G. Applying σ′ to both sides of Eq. (11.9) yields

\[
  \sigma'\sigma(\alpha_1) + \sigma'(a_2)\sigma'\sigma(\alpha_2) + \cdots + \sigma'(a_r)\sigma'\sigma(\alpha_r) = 0, \tag{11.11}
\]

and subtracting Eq. (11.11) from Eq. (11.10) yields

\[
  (a_2 - \sigma'(a_2))\sigma'\sigma(\alpha_2) + \cdots + (a_r - \sigma'(a_r))\sigma'\sigma(\alpha_r) = 0.
\]

Since this is true for all σ ∈ G, and since $a_2 - \sigma'(a_2) \neq 0$, we have produced
a dependence relation on a smaller set of functions $\{\sigma \mapsto \sigma(\alpha_i) \mid i > 1\}$, which
contradicts the minimality of r. Thus no dependence relation as described can exist
and Part 2 must hold.


From the Dedekind Independence Lemma, we extract the following theorem, which summarizes the relationships between field extension degrees and group indices.


Theorem 11.6.2 Let $k \subseteq K$ be a field extension, with Galois group $G = \mathrm{Gal}(K/k)$.

1. Assume that $k \subseteq E_1 \subseteq E_2 \subseteq K$ is a tower of fields, and set $H_i = \mathrm{Gal}(K/E_i)$, $i = 1, 2$. If $[E_2 : E_1] < \infty$, then $[H_1 : H_2] \le [E_2 : E_1]$.

2. Assume that $H_2 \le H_1 \le G$ are subgroups, and set $E_i = \mathrm{inv}_{H_i}(K)$, $i = 1, 2$. If $[H_1 : H_2] < \infty$, then $[E_2 : E_1] \le [H_1 : H_2]$.


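Proof For part 1, we shall assume, by way of contradiction, that $[H_1 : H_2] > [E_2 : E_1]$. Set $r = [E_2 : E_1]$, and let $\{\alpha_1, \ldots, \alpha_r\}$ be an $E_1$-basis of $E_2$. Assume that $\{\sigma_1, \sigma_2, \ldots, \sigma_s\}$, $s > r$, is a set of distinct left $H_2$-coset representatives in $H_1$. Since $s > r$, the following homogeneous system of $r$ linear equations in $s$ unknowns has a non-trivial solution; that is, we may find elements $a_1, a_2, \ldots, a_s \in K$, not all zero, such that

\[ \sum_{i=1}^{s} a_i\sigma_i(\alpha_j) = 0, \quad j = 1, 2, \ldots, r. \]

Since any element of $E_2$ can be written as an $E_1$-linear combination of $\alpha_1, \ldots, \alpha_r$, and each $\sigma_i \in H_1$ fixes $E_1$ point-wise, we conclude that $\sum_{i=1}^{s} a_i\sigma_i : E_2 \to K$ is the 0-mapping. Since $\sigma_1, \ldots, \sigma_s$ are distinct coset representatives, they are distinct mappings $E_2 \to K$. This contradicts part 1 of the Dedekind Independence Lemma.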


For part 2 of the Theorem, we shall assume, by way of contradiction, that $[E_2 : E_1] > [H_1 : H_2]$. Let $\{\sigma_1, \ldots, \sigma_r\}$ be a complete set of $H_2$-coset representatives in $H_1$, and assume that $\{\alpha_1, \ldots, \alpha_s\}$ is an $E_1$-linearly independent subset of $E_2$, where, by assumption, $s > r$. Again, we may find elements $a_1, a_2, \ldots, a_s \in K$, not all zero, such that

\[ \sum_{i=1}^{s} a_i\sigma_j(\alpha_i) = 0, \quad j = 1, 2, \ldots, r. \]

If $\sigma \in H_1$, then $\sigma = \sigma_j\tau$, for some index $j$, $1 \le j \le r$, and for some $\tau \in H_2$. Therefore, as $E_2$ is fixed point-wise by $H_2$, we have

\[ \sum_{i=1}^{s} a_i\sigma(\alpha_i) = \sum_{i=1}^{s} a_i\sigma_j\tau(\alpha_i) = \sum_{i=1}^{s} a_i\sigma_j(\alpha_i) = 0. \]

Since not all $a_i$ are zero in the first term of the equation just presented, the mappings $\sigma \mapsto \sigma(\alpha_i) \in K$ are not $K$-linearly independent, against part 2 of the Dedekind Independence Lemma.

Corollary 11.6.3 Let $k \subseteq K$ be a finite degree extension, and set $G = \mathrm{Gal}(K/k)$. If $k_0 = \mathrm{inv}_G(K)$, then

\[ |G| = [K : k_0]. \]

Proof Note first that $G = \mathrm{Gal}(K/k_0)$. We have

\[ [K : k_0] \ge |G| \ge [K : k_0], \]

where the first inequality is Theorem 11.6.2, part (1), applied to the tower $k_0 \subseteq k_0 \subseteq K$, and the second inequality is Theorem 11.6.2, part (2), applied to the subgroups $1 \le G \le G$. The result follows.
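
A small worked illustration of Corollary 11.6.3 may help here. The two fields below are our own choice of example, not taken from the text; they show that $|G|$ always matches the degree over the fixed field $k_0$, even when it falls short of the degree over $k$.

```latex
\begin{align*}
K=\mathbb{Q}(\sqrt[3]{2}),\ k=\mathbb{Q}:\quad
  & G=\{1\} \text{ (the only zero of } x^3-2 \text{ in } K\subseteq\mathbb{R} \text{ is } \sqrt[3]{2}),\\
  & k_0=\mathrm{inv}_G(K)=K,\qquad |G| = 1 = [K:k_0], \quad\text{although } [K:k]=3;\\[4pt]
K=\mathbb{Q}(\sqrt{2}),\ k=\mathbb{Q}:\quad
  & G=\{1,\sigma\},\ \sigma(\sqrt{2})=-\sqrt{2},\\
  & k_0=\mathrm{inv}_G(K)=\mathbb{Q},\qquad |G| = 2 = [K:k_0] = [K:k].
\end{align*}
```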


11.6.3 Galois Extensions and the Fundamental Theorem 
of Galois Theory 


One may recall from Sect. 2.2.15 that with any Galois connection between two posets, 
there is a closure operator for each poset. That notion of closure holds for the two 
posets that we have been considering here: the poset of subgroups of Gal(K /k) and 
the poset of subfields of K that contain k. Accordingly, we say that a subfield k of 
K is Galois closed in K if and only if 


\[ k = \mathrm{inv}_{\mathrm{Gal}(K/k)}(K). \]


We define an algebraic extension $k \subseteq K$ to be a Galois extension if $k$ is Galois closed in $K$. Note that from the property that $K/k$ is a Galois extension, we can infer immediately that every subfield $E \in \Omega_{K/k}$ of finite degree over $k$ is also Galois closed in $K$. Indeed, we can set $G = \mathrm{Gal}(K/k)$, $\bar{E} = \mathrm{inv}_{\mathrm{Gal}(K/E)}(K)$ (the Galois closure of $E$ in $K$), and use Theorem 11.6.2 to infer that

\[ [E : k] \ge [G : \mathrm{Gal}(K/E)] \ge [\bar{E} : k]. \]

Since we already have $E \subseteq \bar{E}$, the result that $E = \bar{E}$ follows immediately.
We characterize the Galois extensions as follows. 


Theorem 11.6.4 Let $k \subseteq K$ be a finite extension. The following are equivalent:

(i) $k \subseteq K$ is a Galois extension;
(ii) $k \subseteq K$ is a separable, normal extension.


Proof Assume that $k \subseteq K$ is a Galois extension. Assume that $f(x) \in k[x]$ is an irreducible polynomial having a zero $\alpha \in K$. Let $\{\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_r\}$ be the $G$-orbit of $\alpha$ in $K$, where $G = \mathrm{Gal}(K/k)$. Set

\[ g(x) = \prod_{i=1}^{r} (x - \alpha_i) \in K[x]. \]

Note that since each $\sigma \in G$ simply permutes the elements $\alpha_1, \ldots, \alpha_r$, we infer immediately that for each $\sigma \in G$, $\sigma^* g(x) = g(x)$. (Here $\sigma^*$ is the ring automorphism of $K[x]$ that applies $\sigma$ to the coefficients of the polynomials. See Sect. 11.3.1, p. 363.) Therefore the coefficients of $g(x)$ are all in $\mathrm{inv}_G(K) = k$, as $k$ is closed in $K$. Therefore $g(x) \in k[x]$; since $g(\alpha) = 0$, $f(x)$ must divide $g(x)$, which implies that $f(x)$ splits completely in $K[x]$. Since $f(x)$ was arbitrarily chosen in $k[x]$, we conclude that $K$ is a normal extension of $k$.

Note that the above also proves that the arbitrarily-chosen irreducible polynomial $f(x) \in k[x]$ is separable, since it divides $g(x)$, whose zeros $\alpha_1, \ldots, \alpha_r$ are distinct. Applying this to $f(x) = \mathrm{Irr}_k(\alpha)$, $\alpha \in K$, we see that $K$ is a separable extension of $k$, as well.

Conversely, assume that the finite extension $k \subseteq K$ is a separable normal extension. Let $\alpha \in K \setminus k$ and set $f(x) = \mathrm{Irr}_k(\alpha)$. Then $f(x)$ is of degree at least two and splits into distinct linear factors in $K[x]$. Thus, if $\beta \in K$ is another zero of $f(x)$, then by Lemma 11.3.1 there exists a $k$-isomorphism $k(\alpha) \to k(\beta)$ taking $\alpha$ to $\beta$. Next, by Theorem 11.3.4 this isomorphism can be extended to one defined on all of $K$. Therefore, we have shown that for all $\alpha \in K \setminus k$, there is an element $\sigma \in G = \mathrm{Gal}(K/k)$ such that $\sigma(\alpha) \neq \alpha$. It follows that $k = \mathrm{inv}_G(K)$, proving the result.


Theorem 11.6.5 (The Fundamental Theorem of Galois Theory) Suppose $K$ is a finite separable normal extension of the field $k$. Let $G := \mathrm{Gal}(K/k)$, let $\mathcal{S}(G)$ be the dual of the poset of subgroups of $G$, and let $\Omega_{K/k}$ be the poset of subfields of $K$ which contain $k$. Then the following hold:

(i) (The Galois connection) The mappings of the Galois correspondence

\[ \mathrm{Gal}(K/\bullet) : \Omega_{K/k} \to \mathcal{S}(G), \qquad \mathrm{inv}_{\bullet}(K) : \mathcal{S}(G) \to \Omega_{K/k} \]

are inverse to each other, and hence are bijections.
(ii) (The connection between upper field indices and group orders) For each intermediate field $L \in \Omega_{K/k}$,

\[ [K : L] = |\mathrm{Gal}(K/L)|. \]

(iii) (The connection between group indices and lower field indices) If $H \le G$, then $[\mathrm{inv}_H(K) : k] = [G : H]$.
(iv) (The correspondence of normal fields and normal subgroups)

1. $L \in \Omega_{K/k}$ is normal over $k$ if and only if $\mathrm{Gal}(K/L)$ is a normal subgroup of $G = \mathrm{Gal}(K/k)$.
2. $N \trianglelefteq G$ if and only if $\mathrm{inv}_N(K)$ is a normal extension of $k$.
(v) (Induced groups and normal factors) Suppose $L$ is a normal extension of $k$ contained in $K$. Then $\mathrm{Gal}(L/k)$ is isomorphic to the factor group $\mathrm{Gal}(K/k)/\mathrm{Gal}(K/L)$.

Proof Thanks to our preparation, the statement of this Theorem is almost longer than its proof. Part (i) follows by the discussion at the beginning of this subsection. By part (i) together with Theorem 11.6.2, one has, for any subfield $L \in \Omega_{K/k}$, that

\[ [K : L] \ge [\mathrm{Gal}(K/L) : 1] = |\mathrm{Gal}(K/L)| \ge [K : \mathrm{inv}_{\mathrm{Gal}(K/L)}(K)] = [K : L]. \]

Likewise, for any subgroup $H \le G$, one has

\[ [G : H] \ge [\mathrm{inv}_H(K) : k] \ge [G : \mathrm{Gal}(K/\mathrm{inv}_H(K))] = [G : H], \]

proving both parts (ii) and (iii).

We prove part (iv), part 1: If $L \in \Omega_{K/k}$ is normal over $k$, then by Corollary 11.3.8 $L$ is $G$-invariant. This gives a homomorphism $G \to \mathrm{Gal}(L/k)$, $\sigma \mapsto \sigma|_L$; and the kernel of this homomorphism is obviously $\mathrm{Gal}(K/L) \trianglelefteq G$.

For (iv), part 2, suppose $N \trianglelefteq G$, and choose $\alpha \in \mathrm{inv}_N(K)$. Then for any $\sigma \in G$ and $n \in N$ we have $n\sigma(\alpha) = \sigma(\sigma^{-1}n\sigma)(\alpha) = \sigma(\alpha)$, where we have used the fact that $\sigma^{-1}n\sigma \in N$ and $N$ fixes point-wise the elements of $\mathrm{inv}_N(K)$. This proves that $\sigma(\alpha) \in \mathrm{inv}_N(K)$. Now apply Corollary 11.3.8 to conclude that $\mathrm{inv}_N(K)$ is a normal extension of $k$.

Finally, we prove part (v): We have observed above that when $L \in \Omega_{K/k}$ is a normal extension of $k$, there is a homomorphism $G \to \mathrm{Gal}(L/k)$ having kernel $\mathrm{Gal}(K/L)$.

This homomorphism can be seen to be surjective by two distinct arguments. (1) First, a direct application of Corollary 11.3.8 (with $(K, L, k)$ playing the role of $(E, K, F)$ of that Corollary) shows that any automorphism in $\mathrm{Gal}(L/k)$ lifts to an automorphism in $\mathrm{Gal}(K/k)$. (2) A second argument achieves the surjectivity by showing that the order of the image of the homomorphism is at least as large as the order of the codomain $\mathrm{Gal}(L/k)$. First, by the fundamental theorem of group homomorphisms, the order of the image is $[G : \mathrm{Gal}(K/L)]$. By (iii),

\[ [G : \mathrm{Gal}(K/L)] = [\mathrm{inv}_{\mathrm{Gal}(K/L)}(K) : k]. \]

But by definition, $\mathrm{inv}_{\mathrm{Gal}(K/L)}(K)$ contains $L$, and so the field index on the right side is at least $[L : k]$. But since $L/k$ is also a Galois extension, one has $[L : k] = |\mathrm{Gal}(L/k)|$, by (ii) applied to $L/k$. Putting these equations and inequalities together one obtains

\[ |G/\mathrm{Gal}(K/L)| \ge |\mathrm{Gal}(L/k)| \]

and so the homomorphism $G \to \mathrm{Gal}(L/k)$ is again onto.

Now, since the homomorphism is onto, the fundamental theorem of group homomorphisms shows that the factor group $G/\mathrm{Gal}(K/L)$ is isomorphic to the homomorphic image $\mathrm{Gal}(L/k)$.
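
As a concrete instance of the correspondence, the following worked example (our choice of field, not the text's) tabulates the subgroups and intermediate fields for $K = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $k = \mathbb{Q}$, where $G$ is a Klein four-group. The names $\sigma$ and $\tau$ are introduced only for this illustration.

```latex
% sigma: sqrt2 -> -sqrt2, sqrt3 -> sqrt3;   tau: sqrt2 -> sqrt2, sqrt3 -> -sqrt3
\[
\begin{array}{c|c|c|c}
H \le G & \mathrm{inv}_H(K) & [G:H] & [\mathrm{inv}_H(K):\mathbb{Q}] \\ \hline
1 & \mathbb{Q}(\sqrt{2},\sqrt{3}) & 4 & 4 \\
\langle \sigma \rangle & \mathbb{Q}(\sqrt{3}) & 2 & 2 \\
\langle \tau \rangle & \mathbb{Q}(\sqrt{2}) & 2 & 2 \\
\langle \sigma\tau \rangle & \mathbb{Q}(\sqrt{6}) & 2 & 2 \\
G & \mathbb{Q} & 1 & 1
\end{array}
\]
```

Here $G$ is abelian, so every subgroup is normal and, in agreement with part (iv), every intermediate field is normal over $\mathbb{Q}$; part (v) gives, for instance, $\mathrm{Gal}(\mathbb{Q}(\sqrt{2})/\mathbb{Q}) \cong G/\langle\tau\rangle$, a group of order 2.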


11.7 Traces and Norms and Their Applications 


11.7.1 Introduction 


Throughout this section, $F$ is a separable extension of a field $k$, of finite degree $\dim_k(F) = [F : k] = n$.

Let $E$ be the normal closure of $F$ over $k$. Then $E$ is a finite normal separable extension of $k$; that is, it is a Galois extension. Accordingly, if $G = \mathrm{Gal}(E/k)$ and $H$ is the subgroup of $G$ fixing the subfield $F$ element-wise, we have $[G : H] = [F : k]$. Let $\sigma_1 = 1, \sigma_2, \ldots, \sigma_n$ be any right transversal of $H$ in $G$.⁷ We regard each $\sigma_i$ as an isomorphism of $F$ into $E$.

The trace and norm are two functions $T_{F/k}$ and $N_{F/k}$ from $F$ into $k$, which are defined as follows: for each $\alpha \in F$,

\[ T_{F/k}(\alpha) = \sum_{i=1}^{n} \alpha^{\sigma_i}, \tag{11.12} \]
\[ N_{F/k}(\alpha) = \prod_{i=1}^{n} \alpha^{\sigma_i}. \tag{11.13} \]

The elements $\{\alpha^{\sigma_i}\}$ list the full orbit $\alpha^G$ of $\alpha$ under the (right) action of $G$, and so this list does not depend on the particular choice of coset representatives $\{\sigma_i\}$. Since the orbit sum $T_{F/k}(\alpha)$ and orbit product $N_{F/k}(\alpha)$ are both fixed by $G$, and $E/k$ is a Galois extension, these elements must lie in $k = \mathrm{inv}_G(E)$.

When the extension $k \subseteq F$ is understood, one often writes $T(\alpha)$ for $T_{F/k}(\alpha)$ and $N(\alpha)$ for $N_{F/k}(\alpha)$.

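The formulae (11.12) and (11.13) imply the following:

\[ T(\alpha + \beta) = T(\alpha) + T(\beta), \quad \alpha, \beta \in F, \tag{11.14} \]
\[ T(\alpha c) = T(\alpha)c, \quad \alpha \in F,\ c \in k, \tag{11.15} \]

so that $T$ is a $k$-linear transformation $F \to k$.
Similarly, for the norm, (11.12) and (11.13) yield

\[ N(\alpha\beta) = N(\alpha)N(\beta), \quad \alpha, \beta \in F, \tag{11.16} \]
\[ N(\alpha c) = N(\alpha)c^n, \quad \alpha \in F,\ c \in k. \tag{11.17} \]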
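
The identities (11.14) to (11.17) can be checked numerically in the simplest case. The sketch below assumes $F = \mathbb{Q}(\sqrt{2})$ over $k = \mathbb{Q}$ (so $n = 2$); the pair representation and the helper names `conj`, `add`, `mul`, `trace`, `norm` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

D = 2  # assumption for this sketch: F = Q(sqrt(2)), k = Q, so n = [F:k] = 2


def conj(x):
    """Nontrivial automorphism sqrt(D) -> -sqrt(D); x = (a, b) stands for a + b*sqrt(D)."""
    a, b = x
    return (a, -b)


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    # (a + b*sqrt(D)) * (c + e*sqrt(D)) = (ac + D*be) + (ae + bc)*sqrt(D)
    a, b = x
    c, e = y
    return (a * c + D * b * e, a * e + b * c)


def trace(x):
    """Sum of the two conjugates; the sqrt(D) part cancels, leaving an element of Q."""
    s = add(x, conj(x))
    assert s[1] == 0
    return s[0]


def norm(x):
    """Product of the two conjugates; again an element of Q."""
    p = mul(x, conj(x))
    assert p[1] == 0
    return p[0]


alpha = (Fraction(3), Fraction(1, 2))   # 3 + (1/2)*sqrt(2)
beta = (Fraction(-1), Fraction(4))      # -1 + 4*sqrt(2)
c = (Fraction(5, 3), Fraction(0))       # a scalar from k = Q

print(trace(alpha), norm(alpha))
print(norm(mul(alpha, c)) == norm(alpha) * c[0] ** 2)   # Eq. (11.17) with n = 2
```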


⁷ Recall from Chap. 3 that a right transversal of a subgroup is simply a complete system of right coset representatives of the subgroup in its parent group. In this case $\{H\sigma_i\}$ lists all right cosets of $H$ in $G$.

11.7.2 Elementary Properties of the Trace and Norm


Theorem 11.7.1 (The Transitivity Formulae) Suppose $k \subseteq F \subseteq L$ is a tower of fields with both extensions $L/F$ and $F/k$ finite separable extensions. Then for any element $\alpha \in L$,

\[ T_{L/k}(\alpha) = T_{F/k}(T_{L/F}(\alpha)), \tag{11.18} \]
\[ N_{L/k}(\alpha) = N_{F/k}(N_{L/F}(\alpha)). \tag{11.19} \]


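Proof Let $E$ be the normal closure of $L$ so that $E/k$ is a Galois extension. Set $G = \mathrm{Gal}(E/k)$, $H = \mathrm{Gal}(E/F)$ and $U = \mathrm{Gal}(E/L)$. Then $U \le H \le G$. If $X$ is a right transversal of $H$ in $G$, and $Y$ is a right transversal of $U$ in $H$, then the products $\tau\sigma$ with $\tau \in Y$ and $\sigma \in X$ form a right transversal $YX$ of $U$ in $G$, and so

\begin{align}
T_{L/k}(\alpha) = \sum_{\rho \in YX} \alpha^{\rho} &= \sum_{\sigma \in X} \Big( \sum_{\tau \in Y} \alpha^{\tau} \Big)^{\sigma} \tag{11.20} \\
&= \sum_{\sigma \in X} \big( T_{L/F}(\alpha) \big)^{\sigma} = T_{F/k}(T_{L/F}(\alpha)). \tag{11.21}
\end{align}

The analogous formula for norms is obtained upon replacing sums by products in the preceding Eqs. (11.20) and (11.21).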


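Corollary 11.7.2 If $k \subseteq F$ is a separable extension of degree $n = [F : k]$, and if $\alpha \in F$ has monic irreducible polynomial

\[ \mathrm{irr}(\alpha) = x^d + a_{d-1}x^{d-1} + \cdots + a_1 x + a_0 \in k[x], \]

then

\[ T_{F/k}(\alpha) = -(n/d)\,a_{d-1}. \tag{11.22} \]

Similarly,

\[ N_{F/k}(\alpha) = ((-1)^d a_0)^{n/d}. \tag{11.23} \]


Proof Let $E$ be the normal closure of $F$, so that we have the factorization

\[ \mathrm{irr}(\alpha) = (x - \theta_1) \cdots (x - \theta_d) \]

in $E[x]$. Then $-a_{d-1} = \sum \theta_i$ and $\prod \theta_i = (-1)^d a_0$. But $\sum \theta_i = T_{k(\alpha)/k}(\alpha)$ and $\prod \theta_i = N_{k(\alpha)/k}(\alpha)$. Now applying the transitivity formulae (11.18) and (11.19) for the extension tower $k \subseteq k(\alpha) \subseteq F$, and noting that $[F : k(\alpha)] = n/d$, one obtains

\[ T_{F/k}(\alpha) = T_{k(\alpha)/k}\big(T_{F/k(\alpha)}(\alpha)\big) = T_{k(\alpha)/k}(\alpha) \cdot [F : k(\alpha)] \]

since $\alpha \in k(\alpha)$. Similarly

\[ N_{F/k}(\alpha) = \big(N_{k(\alpha)/k}(\alpha)\big)^{[F : k(\alpha)]}. \]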
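
Formulas (11.22) and (11.23) translate directly into a short computation. The following is a minimal sketch, assuming the coefficients are rational and are passed as the list `[a_0, ..., a_{d-1}]`; the function name `trace_norm_from_minpoly` and this input convention are hypothetical choices made for the illustration.

```python
from fractions import Fraction


def trace_norm_from_minpoly(minpoly, n):
    """Return (T_{F/k}(alpha), N_{F/k}(alpha)) from the minimal polynomial of alpha.

    minpoly = [a_0, a_1, ..., a_{d-1}] lists the non-leading coefficients of the
    monic irreducible polynomial of alpha over k; n = [F:k] must be a multiple of d.
    """
    d = len(minpoly)
    if d == 0 or n % d != 0:
        raise ValueError("need d >= 1 and d dividing n")
    a0 = Fraction(minpoly[0])
    a_top = Fraction(minpoly[-1])          # this is a_{d-1}
    tr = -Fraction(n, d) * a_top           # Eq. (11.22)
    nm = ((-1) ** d * a0) ** (n // d)      # Eq. (11.23)
    return tr, nm


# alpha = sqrt(2) inside F = Q(sqrt(2), sqrt(3)): irr(alpha) = x^2 - 2, n = 4
print(trace_norm_from_minpoly([-2, 0], 4))
# alpha = 1 + sqrt(2) inside F = Q(sqrt(2)): irr(alpha) = x^2 - 2x - 1, n = 2
print(trace_norm_from_minpoly([-1, -2], 2))
```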


11.7.3 The Trace Map for a Separable Extension Is Not Zero 


Theorem 11.7.3 Suppose $k \subseteq F$ is a finite separable extension. Then the trace function $T_{F/k} : F \to k$ is not the zero function.


Proof Let $E$ be the normal closure of $F$; it then follows that $k \subseteq E$ is a Galois extension. Let $G = \mathrm{Gal}(E/k)$, and let $\{\sigma_1, \ldots, \sigma_n\}$ be a complete listing of its elements. By the Dedekind Independence Lemma (Lemma 11.6.1), part 1, the functions $\sigma_i : E \to E$ are $k$-linearly independent. In particular, the mapping $\sum \sigma_i$ defined by $\alpha \mapsto \sum_{i=1}^{n} \alpha^{\sigma_i}$ cannot be the zero function. So there is an element $\beta \in E$ such that

\[ 0 \neq \sum_{i=1}^{n} \beta^{\sigma_i} = T_{E/k}(\beta). \tag{11.24} \]

Set $\beta' = T_{E/F}(\beta)$. Then $\beta'$ lies in $F$ since $F \subseteq E$ is also a Galois extension. Now if $T_{F/k}(\beta') = 0$, then

\[ T_{E/k}(\beta) = T_{F/k}(T_{E/F}(\beta)) = T_{F/k}(\beta') = 0, \]

against Eq. (11.24). Thus we have found $\beta' \in F$ such that $T_{F/k}(\beta') \neq 0$, showing that the trace mapping $T_{F/k}$ is not the zero mapping.


Associated with the trace function $T_{F/k} : F \to k$ is a symmetric bilinear form $B_T : F \times F \to k$ called the trace form, defined by

\[ B_T(\alpha, \beta) = T_{F/k}(\alpha\beta), \quad \alpha, \beta \in F. \]

This form is said to be non-degenerate if $B_T(\alpha, \beta) = 0$ for all $\beta \in F$ implies $\alpha = 0$.⁸


⁸ The reader is referred to Sect. 9.12 where these concepts were applied to all $k$-linear maps $T : F \to k$.

Corollary 11.7.4 Let $F$ be a finite separable extension of the field $k$. Then the trace form is non-degenerate.


Proof If $B_T(\alpha, \beta) = 0$ for all $\beta \in F$, then $T_{F/k}(\alpha\beta) = 0$ for all $\beta \in F$. If $\alpha \neq 0$, then $\alpha F = F$, so this would imply $T_{F/k}(F) = 0$, contrary to the conclusion of the preceding Theorem 11.7.3.


11.7.4 An Application to Rings of Integral Elements 


Suppose $D$ is an integral domain, which is integrally closed in its field of fractions $k$. (Recall that this means that any fraction formed from elements of $D$ that is also a zero of a monic polynomial in $D[x]$, is already an element of $D$.) Suppose $F$ is a finite separable extension of $k$, and let $\mathcal{O}_F$ be the ring of integral elements of $F$ with respect to $D$, that is, the elements of $F$ which are a zero of at least one monic polynomial in $D[x]$.

In Sect. 9.12 of Chap. 9, a $k$-linear transformation $t : F \to k$ was said to be tracelike if and only if $t(\mathcal{O}_F) \subseteq D$. Theorem 9.12.2 then asserted the following:


If there exists a non-zero tracelike transformation $t : F \to k$, then the ring $\mathcal{O}_F$ is a Noetherian $D$-module.


But now we have 


Lemma 11.7.5 Let $F$ be a finite separable extension of $k$, the field of fractions of the integrally closed domain $D$, and let $\mathcal{O}_F$ be the ring of integral elements as in the introductory paragraph of this subsection. Then the trace function $T_{F/k} : F \to k$ is a tracelike $k$-linear transformation.


Proof It is sufficient to show that if $\alpha \in \mathcal{O}_F$, then $T_{F/k}(\alpha) \in \mathcal{O}_F$, for in that case $T_{F/k}(\alpha) \in \mathcal{O}_F \cap k = D$, since $D$ is integrally closed. Let $E$ be the normal closure of $F$ and set $G := \mathrm{Gal}(E/k)$. For each $\sigma \in G$ and $\alpha \in \mathcal{O}_E$, $\alpha^{\sigma}$ is also a zero of the same monic polynomial in $D[x]$ that $\alpha$ is; so it follows that $\sigma(\mathcal{O}_E) \subseteq \mathcal{O}_E$. Now $T_{F/k}(\alpha)$ is a sum of elements in the orbit $\alpha^G$ and so, being a finite sum of elements of $\mathcal{O}_E$, must lie in $\mathcal{O}_E$ as well as $k$. Thus $T_{F/k}(\alpha) \in \mathcal{O}_E \cap k \subseteq \mathcal{O}_E \cap F = \mathcal{O}_F$.


Corollary 11.7.6 Let $D$ be an integral domain that is integrally closed in its field of fractions $k$. Let $k \subseteq F$ be a finite separable extension, and let $\mathcal{O}_F$ be the ring of integral elements (with respect to $D$) in the field $F$. Then $\mathcal{O}_F$ is a Noetherian $D$-module.


Proof By Theorem 9.12.2 it is sufficient to observe that the trace function $T_{F/k}$ is a non-zero tracelike transformation $F \to k$. But as $F$ is a separable extension of $k$, these two features of the trace function are established in Theorem 11.7.3 and Lemma 11.7.5.

11.8 The Galois Group of a Polynomial


11.8.1 The Cyclotomic Polynomials 


Let $n$ be a positive integer, and let $\zeta$ be the complex number $\zeta = e^{2\pi i/n}$. Set

\[ \Phi_n(x) = \prod_{d} (x - \zeta^d), \]

where $1 \le d \le n$, and $\gcd(d, n) = 1$. We call $\Phi_n(x)$ the $n$th cyclotomic polynomial. Thus, we see that the zeros of $\Phi_n(x)$ are precisely the generators of the unique cyclic subgroup of order $n$ in the multiplicative group $\mathbb{C}^*$ of the complex numbers, also called the primitive $n$th roots of unity. It follows that the degree of $\Phi_n(x)$ is the number $\varphi(n)$ of residue classes mod $n$ which are relatively prime to $n$.⁹ Note in particular that

\[ x^n - 1 = \prod_{d \mid n} \Phi_d(x). \tag{11.25} \]

Since each $n$th root of unity is a power of the primitive root $\zeta$, we see that the field $K = \mathbb{Q}(\zeta)$ is the splitting field over $\mathbb{Q}$ for $x^n - 1$. If we set $G = \mathrm{Gal}(K/\mathbb{Q})$, then $G$ clearly acts on the zeros of $\Phi_n(x)$ (though we don't know yet that this action is transitive!), and so the coefficients of $\Phi_n(x)$ are in $\mathrm{inv}_G(K) = \mathbb{Q}$. In fact, however,

Lemma 11.8.1 For each positive integer $n$, $\Phi_n(x) \in \mathbb{Z}[x]$.


Proof We shall argue by induction on $n$. First one has $\Phi_1(x) = x - 1 \in \mathbb{Z}[x]$, so the assertion holds for $n = 1$. Assume $n > 1$. In Eq. (11.25), $c(x) := x^n - 1$ is written as a product of cyclotomic polynomials $\Phi_d(x)$, all of which are monic by definition. By induction

\[ a(x) := \prod_{\substack{d \mid n \\ 1 \le d < n}} \Phi_d(x) \in \mathbb{Z}[x], \]

(where $d$ ranges over proper divisors of $n$) is a product of monic polynomials in $\mathbb{Z}[x]$, and so itself is such a polynomial. Setting $b(x) := \Phi_n(x)$, we see that $a(x)$, $b(x)$ and $c(x)$ are monic polynomials with $c(x) = a(x)b(x)$, with $a(x)$ and $c(x)$ in $\mathbb{Z}[x]$ and with $b(x)$ monic in $\mathbb{Q}[x]$. Now we apply Theorem 9.13.3 (with $\mathbb{Z}$ and $\mathbb{Q}$ in the roles of the domains $D$ and $D_1$, respectively; see Exercise (11) in Sect. 9.13.1, Chap. 9), to conclude that $b(x) = \Phi_n(x)$ is also in $\mathbb{Z}[x]$. The induction proof is complete.
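
The induction in this proof is effectively an algorithm: divide $x^n - 1$ by $\Phi_d(x)$ for each proper divisor $d$ of $n$, using Eq. (11.25). The sketch below carries this out over the integers; the helper names `poly_divexact` and `cyclotomic` and the coefficient-list convention (lowest degree first) are assumptions made for this illustration. Division by a monic integer polynomial never leaves $\mathbb{Z}$, which mirrors the conclusion of the lemma.

```python
from functools import lru_cache


def poly_divexact(num, den):
    """Exact division of integer polynomials given as coefficient sequences
    (lowest degree first). den must be monic; raises if a remainder is left."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for i in range(len(quotient) - 1, -1, -1):
        c = num[i + len(den) - 1]      # den is monic, so this is the next quotient coefficient
        quotient[i] = c
        for j, dj in enumerate(den):
            num[i + j] -= c * dj
    if any(num):
        raise ValueError("division was not exact")
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of Phi_n (lowest degree first), via x^n - 1 = prod_{d | n} Phi_d."""
    poly = [-1] + [0] * (n - 1) + [1]              # x^n - 1
    for d in range(1, n):
        if n % d == 0:                             # proper divisors of n
            poly = poly_divexact(poly, cyclotomic(d))
    return tuple(poly)


print(cyclotomic(6))    # coefficients of x^2 - x + 1
print(cyclotomic(12))   # coefficients of x^4 - x^2 + 1
```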


Theorem 11.8.2 For each positive integer $n$, the $n$th cyclotomic polynomial $\Phi_n(x)$ is irreducible in $\mathbb{Q}[x]$.


⁹ The function $\varphi : \mathbb{Z} \to \mathbb{Z}$ is called Euler's totient function or simply the Euler phi-function.

Proof Since $\Phi_n(x) \in \mathbb{Z}[x]$, we may invoke Gauss's lemma and be satisfied with proving that $\Phi_n(x)$ is irreducible in $\mathbb{Z}[x]$. Thus, assume that $\Phi_n(x) = h(x)k(x)$, where $h(x), k(x) \in \mathbb{Z}[x]$ and $h(x)$ is monic and irreducible. Let $p$ be a prime not dividing $n$, and let $\zeta$ be a zero of $h(x)$ in a splitting field $F$ for $\Phi_n(x)$ over $\mathbb{Q}$. We shall show that $\zeta^p$ is also a zero of $h(x)$. Note that since $p$ and $n$ are relatively prime, $\zeta^p$ is another zero of $\Phi_n(x)$. Assuming that $\zeta^p$ is not a zero of $h(x)$, it must be a zero of $k(x)$, forcing $\zeta$ to be a zero of the polynomial $k(x^p)$. This implies that $h(x)$ divides $k(x^p)$, and so we may now write

\[ k(x^p) = h(x)l(x) \tag{11.26} \]

for some monic polynomial $l(x) \in \mathbb{Z}[x]$.
At this point, we may invoke the ring homomorphism

\[ m_p : \mathbb{Z}[x] \to (\mathbb{Z}/(p))[x], \]

which preserves the degrees of monic polynomials but reads the integral coefficients of all polynomials modulo $p$. For each polynomial $q(x) \in \mathbb{Z}[x]$, we write $\bar{q}(x)$ for $m_p(q(x))$.

Since $n$ is relatively prime to $p$, there exists an integer $b$ such that $bn \equiv 1 \bmod p$. Writing $\partial$ for the formal derivative, we have $(bx)\,\partial(x^n - 1) - (x^n - 1) = 1$ in $(\mathbb{Z}/(p))[x]$, so by Lemma 11.5.3 the polynomial $x^n - 1$ must have distinct zeroes in its splitting field $K$ over $\mathbb{Z}/(p)$. So this also must be true of any factor of the polynomial $x^n - 1$. By Eq. (11.25) we have $x^n - 1 = \bar{\Phi}_n(x)\bar{f}(x) = \bar{h}(x)\bar{k}(x)\bar{f}(x)$, where $f(x)$ is the product of the $\Phi_d(x)$ over the proper divisors $d$ of $n$, so $\bar{\Phi}_n(x)$ is such a factor.

Now, applying $m_p$ to each side of Eq. (11.26), one obtains

\[ \bar{h}(x)\bar{l}(x) = \bar{k}(x^p) = \bar{k}(x)^p \in (\mathbb{Z}/(p))[x]. \]

Thus the zeroes of $\bar{h}$ in $K$ can be found among those of $\bar{k}(x)$. Since $\bar{\Phi}_n(x) = \bar{h}(x)\bar{k}(x)$, we see that $\bar{\Phi}_n(x)$ has repeated zeroes in $K$, contrary to the observation in the previous paragraph.

What the above has shown is that if $\zeta$ is a zero of $h(x)$ in the splitting field $F$, then so is $\zeta^p$, for every prime $p$ not dividing $n$. Finally, let $\eta$ be any primitive $n$th root of unity (i.e., a zero of $\Phi_n(x)$ in $F$). Then $\eta = \zeta^r$ for some integer $r$ relatively prime to $n$. We factor $r$ as $r = p_1^{e_1} p_2^{e_2} \cdots p_s^{e_s}$; then as each $p_i$ is relatively prime to $n$, and since the $p_i$th power of a zero of $h(x)$ is another zero of $h(x)$, we conclude that $\eta = \zeta^r$ is also a zero of $h(x)$. It follows that all zeros of $\Phi_n(x)$ are zeroes of its irreducible monic factor $h(x)$, whence $h(x) = \Phi_n(x)$, completing the proof.
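
The pivotal identity in the proof is $\bar{k}(x^p) = \bar{k}(x)^p$ in $(\mathbb{Z}/(p))[x]$, which holds because raising to the $p$th power is additive in characteristic $p$ and fixes each element of $\mathbb{Z}/(p)$. The sketch below simply checks this for a sample polynomial; the helper names and the sample coefficients are hypothetical and serve only as an illustration.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (lowest degree first), reduced mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def poly_pow_mod(f, e, p):
    """f(x)^e with coefficients reduced mod p, by repeated multiplication."""
    result = [1]
    for _ in range(e):
        result = poly_mul_mod(result, f, p)
    return result


def substitute_x_to_xp(f, p):
    """f(x^p) with coefficients reduced mod p."""
    out = [0] * ((len(f) - 1) * p + 1)
    for i, a in enumerate(f):
        out[i * p] = a % p
    return out


sample = [3, -1, 0, 2, 1]      # 3 - x + 2x^3 + x^4, an arbitrary monic example
p = 5
print(poly_pow_mod(sample, p, p) == substitute_x_to_xp(sample, p))
```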


11.8.2 The Galois Group as a Permutation Group 


Let $F$ be a field and let $f(x) \in F[x]$ be a polynomial. If $K \supseteq F$ is a splitting field over $F$ for $f(x)$, and if $G = \mathrm{Gal}(K/F)$, we call $G$ the Galois group of the polynomial $f(x)$. If $\alpha_1, \alpha_2, \ldots, \alpha_k$ are the distinct zeros of $f(x)$ in $K$, then, since $K = F(\alpha_1, \alpha_2, \ldots, \alpha_k)$, we see that the automorphisms in $G$ are determined by their effects on the elements $\alpha_1, \ldots, \alpha_k$. Furthermore, as the elements of $G$ clearly permute these zeros, we have an injective homomorphism $G \to S_k$ (where $S_k$ is identified with the symmetric group on the $k$ zeros of $f(x)$) thereby embedding $G$ as a subgroup of $S_k$. Note finally that if $f(x)$ is irreducible, then the above embedding represents the Galois group $G$ as a transitive subgroup of $S_k$.

For example, in the previous subsection we saw that the $n$th cyclotomic polynomial $\Phi_n(x) \in \mathbb{Z}[x]$ is irreducible and of degree $\varphi(n)$, where $\varphi$ is Euler's "totient" function. Setting $k = \varphi(n)$, $\zeta = e^{2\pi i/n}$, and $G = \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$, we have an embedding of $G$ into $S_k$. However, as $G$ must act as a group of automorphisms of $K = \mathbb{Q}(\zeta)$, we see in particular that it must restrict to a group of automorphisms of the cyclic group $\langle\zeta\rangle$ of order $n$. Therefore, we have a (faithful) homomorphism $G \to \mathrm{Aut}(\langle\zeta\rangle)$, the latter being abelian of order $\varphi(n)$ (see p. 392). Since $|G| = \varphi(n) = \deg \Phi_n(x)$, and since the splitting field of $\Phi_n(x)$ is the same as that of $x^n - 1$, we conclude that:


Theorem 11.8.3 The Galois group of the polynomial $x^n - 1$ over $\mathbb{Q}$ is isomorphic with the automorphism group of a cyclic group of order $n$ and is therefore abelian of order $\varphi(n)$, where $\varphi$ is Euler's totient function.


The following is a useful summary of what has been obtained thus far. 


Theorem 11.8.4 Let $f(x) \in F[x]$ and let $G$ be the corresponding Galois group. Assume that $f(x)$ factors into irreducibles as

\[ f(x) = \prod_{j} f_j(x) \in F[x]. \]

Let $E$ be a splitting field over $F$ of $f(x)$, and let $A_j$ be the set of zeros of $f_j(x)$ in $E$. Then $G$ acts transitively on each $A_j$.


Proof If $\alpha, \alpha'$ are distinct zeros of $f_j(x)$ in $E$, then by Lemma 11.3.1 there is an isomorphism $F(\alpha) \to F(\alpha')$ taking $\alpha$ to $\alpha'$; by Theorem 11.3.4, this can be extended to an automorphism of $E$.


From the above, we see that if $A$ is the set of zeros of a polynomial $f(x)$ in a splitting field $E$, we have an embedding $G \to \mathrm{Sym}(A)$ of the Galois group into the symmetric group on $A$. Identifying $G$ with its image in $\mathrm{Sym}(A)$, an interesting question that naturally occurs is whether $G \le \mathrm{Alt}(A)$, the alternating group on $A$. To answer this, we introduce the discriminant of the separable polynomial $f(x)$. Thus let $f(x) \in F[x]$, where $\mathrm{char}\, F \neq 2$, and let $E$ be a splitting field over $F$ for $f(x)$.

Let $\{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ be the set of distinct roots of $f(x)$ in $E$. Set

\[ \delta = \prod_{1 \le j < i \le k} (\alpha_i - \alpha_j) = \det \begin{pmatrix} 1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{k-1} \\ 1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{k-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha_k & \alpha_k^2 & \cdots & \alpha_k^{k-1} \end{pmatrix}, \]

and let $D_f = \delta^2$. We call $D_f$ the discriminant of the polynomial $f(x)$, sometimes denoted $\mathrm{disc}\, f(x)$. Note that $D_f \in \mathrm{inv}_G(E)$; when $f(x)$ is separable, this implies that $D_f \in F$.


Example 54 (Discriminant of a quadratic) Suppose that $f(x) \in F[x]$ is a monic separable quadratic; thus $f(x) = x^2 + bx + c = (x - \alpha_1)(x - \alpha_2)$ in a splitting field over $F$ for $f(x)$. Therefore, we have

(i) $\alpha_1 + \alpha_2 = -b$, and
(ii) $\alpha_1\alpha_2 = c$.

It follows that

\begin{align*}
\mathrm{disc}\, f(x) &= (\alpha_2 - \alpha_1)^2 \\
&= \alpha_1^2 + \alpha_2^2 - 2\alpha_1\alpha_2 \\
&= (\alpha_1 + \alpha_2)^2 - 4\alpha_1\alpha_2 \\
&= b^2 - 4c \in F,
\end{align*}

a formula for the discriminant familiar to every high school student.


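Before tackling the $n = 3$ case, we make a few observations of general interest. Relative to $\alpha_1, \alpha_2, \ldots, \alpha_k$ we define the $i$th power sum, $i = 0, 1, 2, \ldots$:

\[ s_i = \alpha_1^i + \alpha_2^i + \cdots + \alpha_k^i. \]

Since the determinant of a matrix is the same as the determinant of its transpose, we may express the discriminant of the polynomial $f(x)$ thus:

\[ \mathrm{disc}\, f(x) = \det \left[ \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_k \\ \alpha_1^2 & \alpha_2^2 & \cdots & \alpha_k^2 \\ \vdots & \vdots & & \vdots \\ \alpha_1^{k-1} & \alpha_2^{k-1} & \cdots & \alpha_k^{k-1} \end{pmatrix} \begin{pmatrix} 1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{k-1} \\ 1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{k-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha_k & \alpha_k^2 & \cdots & \alpha_k^{k-1} \end{pmatrix} \right], \]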
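from which one concludes that

\[ \mathrm{disc}\, f(x) = \det \begin{pmatrix} s_0 & s_1 & s_2 & \cdots & s_{k-1} \\ s_1 & s_2 & s_3 & \cdots & s_k \\ s_2 & s_3 & s_4 & \cdots & s_{k+1} \\ \vdots & \vdots & \vdots & & \vdots \\ s_{k-1} & s_k & s_{k+1} & \cdots & s_{2k-2} \end{pmatrix}, \tag{11.27} \]

where $s_0 = k$.

Example 55 (Discriminant of a cubic) Suppose that $f(x) = x^3 + a_2x^2 + a_1x + a_0 \in F[x]$ is a monic separable cubic, with zeros $\alpha_1, \alpha_2, \alpha_3$ in a splitting field. In this case, using Eq. (11.27) one obtains

\[ \mathrm{disc}\, f(x) = \det \begin{pmatrix} 3 & s_1 & s_2 \\ s_1 & s_2 & s_3 \\ s_2 & s_3 & s_4 \end{pmatrix}. \]

Next, note that the coefficients of $f(x)$ are given by

\begin{align*}
-\sigma_1 &:= -(\alpha_1 + \alpha_2 + \alpha_3) = a_2, \\
\sigma_2 &:= \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3 = a_1, \\
-\sigma_3 &:= -\alpha_1\alpha_2\alpha_3 = a_0.
\end{align*}

Furthermore, one has $s_1 = \sigma_1 = -a_2$, and

\begin{align*}
s_2 &= \alpha_1^2 + \alpha_2^2 + \alpha_3^2 \\
&= (\alpha_1 + \alpha_2 + \alpha_3)^2 - 2(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3) \\
&= \sigma_1^2 - 2\sigma_2 \\
&= a_2^2 - 2a_1,
\end{align*}

\begin{align*}
s_3 &= \alpha_1^3 + \alpha_2^3 + \alpha_3^3 \\
&= (\alpha_1 + \alpha_2 + \alpha_3)(\alpha_1^2 + \alpha_2^2 + \alpha_3^2) - (\alpha_1 + \alpha_2 + \alpha_3)(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3) + 3\alpha_1\alpha_2\alpha_3 \\
&= \sigma_1(\sigma_1^2 - 2\sigma_2) - \sigma_1\sigma_2 + 3\sigma_3 \\
&= \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3 \\
&= -a_2^3 + 3a_1a_2 - 3a_0,
\end{align*}

\begin{align*}
s_4 &= \alpha_1^4 + \alpha_2^4 + \alpha_3^4 \\
&= (\alpha_1 + \alpha_2 + \alpha_3)(\alpha_1^3 + \alpha_2^3 + \alpha_3^3) - (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)(\alpha_1^2 + \alpha_2^2 + \alpha_3^2) + \alpha_1\alpha_2\alpha_3(\alpha_1 + \alpha_2 + \alpha_3) \\
&= \sigma_1(\sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3) - \sigma_2(\sigma_1^2 - 2\sigma_2) + \sigma_1\sigma_3 \\
&= a_2^4 - 4a_1a_2^2 + 4a_0a_2 + 2a_1^2.
\end{align*}

From all of the above, one obtains, therefore, that

\begin{align*}
\mathrm{disc}\, f(x) &= \det \begin{pmatrix} 3 & -a_2 & a_2^2 - 2a_1 \\ -a_2 & a_2^2 - 2a_1 & -a_2^3 + 3a_1a_2 - 3a_0 \\ a_2^2 - 2a_1 & -a_2^3 + 3a_1a_2 - 3a_0 & a_2^4 - 4a_1a_2^2 + 4a_0a_2 + 2a_1^2 \end{pmatrix} \\
&= a_1^2a_2^2 - 4a_1^3 - 4a_0a_2^3 - 27a_0^2 + 18a_0a_1a_2,
\end{align*}

after admittedly odious calculations!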
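
The odious calculation can be delegated to a short program. The sketch below computes the power sums from the coefficients by Newton's identities (a standard technique, used here in place of the hand expansions above), forms the determinant of power sums from Eq. (11.27) with top-left entry $s_0 = k$, and compares the result with the closed form for cubics. The function names and the coefficient ordering `[c_{k-1}, ..., c_0]` are assumptions made for this illustration.

```python
from fractions import Fraction


def power_sums(coeffs, count):
    """coeffs = [c_{k-1}, ..., c_0] for the monic polynomial x^k + c_{k-1}x^{k-1} + ... + c_0.
    Returns [s_0, ..., s_{count-1}], the power sums of its zeros, via Newton's identities."""
    k = len(coeffs)
    # elementary symmetric functions of the zeros: e_i = (-1)^i * c_{k-i}
    e = [Fraction(1)] + [(-1) ** i * Fraction(coeffs[i - 1]) for i in range(1, k + 1)]
    s = [Fraction(k)]
    for m in range(1, count):
        total = Fraction(0)
        for i in range(1, min(m - 1, k) + 1):
            total += (-1) ** (i - 1) * e[i] * s[m - i]
        if m <= k:
            total += (-1) ** (m - 1) * m * e[m]
        s.append(total)
    return s


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def discriminant(coeffs):
    """Determinant of the k-by-k matrix (s_{i+j}), as in Eq. (11.27)."""
    k = len(coeffs)
    s = power_sums(coeffs, 2 * k - 1)
    return det([[s[i + j] for j in range(k)] for i in range(k)])


def cubic_closed_form(a2, a1, a0):
    """Closed form for x^3 + a2 x^2 + a1 x + a0 obtained in Example 55."""
    return a1**2 * a2**2 - 4 * a1**3 - 4 * a0 * a2**3 - 27 * a0**2 + 18 * a0 * a1 * a2


print(discriminant([0, 0, -2]), cubic_closed_form(0, 0, -2))   # x^3 - 2
```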


The above examples reveal a general trend, viz., that the discriminant of a polynomial can always be expressed in terms of the coefficients of the polynomial. To see why this is the case, we assume, for the moment, that $x_1, x_2, \ldots, x_n$ are commuting indeterminates, and define the elementary symmetric polynomials $\sigma_1, \sigma_2, \ldots, \sigma_n$, and the power sum polynomials $s_1, s_2, \ldots$ by setting

\[ \sigma_k = \sum_{i_1 < i_2 < \cdots < i_k} x_{i_1}x_{i_2}\cdots x_{i_k} \quad (1 \le k \le n), \qquad s_k = \sum_{i=1}^{n} x_i^k, \quad k = 1, 2, \ldots. \]

Therefore,

\begin{align*}
\sigma_1 &= s_1 = x_1 + x_2 + \cdots + x_n, \\
\sigma_2 &= \sum_{i<j} x_ix_j = x_1x_2 + x_1x_3 + \cdots + x_{n-1}x_n, \\
s_k &= x_1^k + x_2^k + \cdots + x_n^k,
\end{align*}

and so on.

Next, note that

\[ \prod_{i=1}^{n} (x - x_i) = x^n - \sigma_1x^{n-1} + \sigma_2x^{n-2} - \cdots + (-1)^n\sigma_n. \]

From this, it follows immediately that if the power sum polynomials $s_1, s_2, \ldots$ can be written as polynomials in the elementary symmetric polynomials $\sigma_1, \sigma_2, \ldots, \sigma_n$, then from Eq. (11.27) we infer immediately that the discriminant of a polynomial can be expressed as a polynomial in its coefficients. Our detailed calculations of the discriminant of quadratics and cubics hint that this might be possible. We turn now to a demonstration that this can always be done.

First, suppose that $g(\mathbf{x}) = g(x_1, x_2, \ldots, x_n) \in F[x_1, x_2, \ldots, x_n]$. We say that $g(\mathbf{x})$ is a symmetric polynomial if it remains unchanged upon any permutation of the indices $1, 2, \ldots, n$. Therefore, we see immediately that the elementary symmetric polynomials as well as the power sum polynomials are symmetric polynomials.


Theorem 11.8.5 (Fundamental Theorem on Symmetric Polynomials) Any symmetric polynomial $g(\mathbf{x}) \in F[x_1, x_2, \ldots, x_n]$ can be expressed as a polynomial in the elementary symmetric polynomials.

Proof We clearly may assume that $g = g(\mathbf{x})$ is homogeneous, i.e., that it is composed of monomials each of the same degree, say $k$. Next, we introduce into the set of monomials of the form $x_1^{i_1}x_2^{i_2}\cdots x_n^{i_n}$, where $i_1 + i_2 + \cdots + i_n = k$, the so-called lexicographic ordering. Thus we say that $x_1^{i_1}\cdots x_n^{i_n} < x_1^{j_1}\cdots x_n^{j_n}$ if $i_1 = j_1, i_2 = j_2, \ldots, i_l = j_l$, but $i_{l+1} < j_{l+1}$. Thus, if $n = 3$ we have, for example, $x_1x_2x_3^2 < x_1x_2^2x_3 < x_1^2x_2x_3$.

Next, let $x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n}$ be the highest monomial occurring in $g$ with nonzero coefficient. Since $g$ is symmetric, it must also contain all monomials obtained from $x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n}$ by permuting the indices. It follows, therefore, that we must have $m_1 \ge m_2 \ge \cdots \ge m_n$.

Next, let $d_1, d_2, \ldots, d_n$ be exponents; we wish to identify the highest monomial in the symmetric polynomial $\sigma_1^{d_1}\sigma_2^{d_2}\cdots\sigma_n^{d_n}$. Here it is useful to observe that if $M_1, M_2$ are monomials of the same degree with $M_1 < M_2$, and if $M$ is any monomial, then $M_1M < M_2M$. Having observed this, we note next that the highest monomial in $\sigma_i$ is clearly $x_1x_2\cdots x_i$. Therefore, the highest monomial in $\sigma_i^{d_i}$ is $x_1^{d_i}x_2^{d_i}\cdots x_i^{d_i}$. In turn, the highest monomial in $\sigma_1^{d_1}\sigma_2^{d_2}\cdots\sigma_n^{d_n}$ is therefore

\[ x_1^{d_1 + d_2 + \cdots + d_n}\, x_2^{d_2 + d_3 + \cdots + d_n} \cdots x_n^{d_n}. \]

From this, we see that the highest monomial in both $g$ and the polynomial $\sigma_1^{m_1 - m_2}\sigma_2^{m_2 - m_3}\cdots\sigma_n^{m_n}$ is $x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n}$. This implies that if

\[ g = a\,x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n} + \text{lower monomials}, \]

then the symmetric polynomial $g - a\,\sigma_1^{m_1 - m_2}\sigma_2^{m_2 - m_3}\cdots\sigma_n^{m_n}$ will only involve monomials lower than $x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n}$. A simple induction finishes the proof.
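
The proof is constructive, and the reduction it describes can be sketched as a short program: repeatedly take the lexicographically highest monomial $a\,x_1^{m_1}\cdots x_n^{m_n}$ and subtract $a\,\sigma_1^{m_1-m_2}\cdots\sigma_n^{m_n}$. The representation of a polynomial as a dictionary from exponent tuples to coefficients, and the names `poly_mul`, `elementary`, `in_elementary`, are assumptions of this illustration rather than anything fixed by the text.

```python
from itertools import combinations


def poly_mul(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(a + b for a, b in zip(ef, eg))
            out[e] = out.get(e, 0) + cf * cg
    return {e: c for e, c in out.items() if c != 0}


def elementary(i, n):
    """The elementary symmetric polynomial sigma_i in n variables."""
    return {tuple(1 if v in subset else 0 for v in range(n)): 1
            for subset in combinations(range(n), i)}


def in_elementary(g, n):
    """Rewrite a symmetric polynomial g in n variables in terms of sigma_1..sigma_n.
    The result maps (d_1, ..., d_n) to the coefficient of sigma_1^d_1 ... sigma_n^d_n."""
    g = {e: c for e, c in g.items() if c != 0}
    result = {}
    while g:
        m = max(g)                      # Python compares tuples lexicographically
        a = g[m]
        if any(m[i] < m[i + 1] for i in range(n - 1)):
            raise ValueError("input is not symmetric")
        # exponents d_i = m_i - m_{i+1}, with d_n = m_n
        d = tuple(m[i] - m[i + 1] for i in range(n - 1)) + (m[n - 1],)
        result[d] = result.get(d, 0) + a
        term = {(0,) * n: a}
        for i, di in enumerate(d, start=1):
            for _ in range(di):
                term = poly_mul(term, elementary(i, n))
        for e, c in term.items():       # g <- g - a * sigma_1^d_1 ... sigma_n^d_n
            g[e] = g.get(e, 0) - c
            if g[e] == 0:
                del g[e]
    return result


# s_2 = x1^2 + x2^2 + x3^2 should come out as sigma_1^2 - 2*sigma_2
s2 = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
print(in_elementary(s2, 3))
```

Because the lexicographic order is a well-ordering on exponent tuples, the loop terminates even for inhomogeneous input, so the homogeneity reduction made at the start of the proof is not needed in the code.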


Corollary 11.8.6 The discriminant of the separable polynomial $f(x) \in F[x]$ can be expressed as a polynomial (with coefficients in $F$) in the coefficients of $f(x)$.


The permutation-group theoretic importance of the discriminant is the following.


Theorem 11.8.7 Let $f(x) \in F[x]$, where $\mathrm{char}\, F \neq 2$, be a polynomial with discriminant $D_f = \delta^2 = \mathrm{disc}\, f(x)$ defined as above. Let $G$ be the Galois group of $f(x)$, regarded as a subgroup of $\mathrm{Sym}(n)$, acting on the zeros $\{\alpha_1, \ldots, \alpha_n\}$ in a splitting field $E$ over $F$ for $f(x)$. If $A := G \cap \mathrm{Alt}(n)$, then $\mathrm{inv}_A(E) = F(\delta)$.


Proof Note that any odd permutation of the zeros $\alpha_1, \ldots, \alpha_n$ will transform $\delta$ into $-\delta$ and even permutations will fix $\delta$.


The following is immediate.


Corollary 11.8.8 Let $G$ be the Galois group of $f(x) \in F[x]$, where $\mathrm{char}\, F \neq 2$. If the discriminant $D_f$ is the square of an element in $F$, then $G \le \mathrm{Alt}(n)$.
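
Two cubics over $\mathbb{Q}$ (our own choice of examples) show how the corollary is used. For $x^3 + qx + r$ the discriminant formula for cubics, with vanishing $x^2$ coefficient, reduces to $-4q^3 - 27r^2$.

```latex
\begin{align*}
f(x) = x^3 - 3x + 1: \quad & D_f = -4(-3)^3 - 27(1)^2 = 108 - 27 = 81 = 9^2,
   \text{ so } G \le \mathrm{Alt}(3); \\
 & f \text{ has no rational zero, hence is irreducible, so } G \text{ is transitive and } G = \mathrm{Alt}(3). \\[4pt]
f(x) = x^3 - 2: \quad & D_f = -4(0)^3 - 27(-2)^2 = -108,
   \text{ which is not a square in } \mathbb{Q}; \\
 & \text{the corollary gives no restriction here, and in fact } G \cong \mathrm{Sym}(3).
\end{align*}
```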


The following is occasionally useful in establishing that the Galois group of a polynomial is the full symmetric group.

Theorem 11.8.9 Let $f(x) \in \mathbb{Q}[x]$ be irreducible, of prime degree $p$, and assume that $f(x)$ has exactly 2 non-real roots. Then $G_f = \mathrm{Sym}(p)$.


We close by mentioning that for a general "trinomial," there is a formula for the discriminant, due to R.G. Swan,¹⁰ given as follows.


Theorem 11.8.10 Let $f(x) = x^n + ax^k + b$, where $n > k > 0$, and let $d = \gcd(n, k)$, $N = n/d$, $K = k/d$. Then

\[ D_f = (-1)^{n(n-1)/2}\, b^{k-1} \left[ n^N b^{N-K} - (-1)^N (n-k)^{N-K} k^K a^N \right]^d. \]
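
As a sanity check on the trinomial formula of Theorem 11.8.10, the sketch below evaluates it in the form $D_f = (-1)^{n(n-1)/2} b^{k-1}[n^N b^{N-K} - (-1)^N (n-k)^{N-K} k^K a^N]^d$ with $n > k > 0$ (this reading of the theorem is an assumption of the sketch) and compares it with the familiar discriminants of quadratics and depressed cubics. The function name `trinomial_discriminant` is hypothetical.

```python
from fractions import Fraction
from math import gcd


def trinomial_discriminant(n, k, a, b):
    """Discriminant of x^n + a*x^k + b for integers n > k > 0 (trinomial formula)."""
    if not n > k > 0:
        raise ValueError("need n > k > 0")
    d = gcd(n, k)
    N, K = n // d, k // d
    a, b = Fraction(a), Fraction(b)
    sign = (-1) ** (n * (n - 1) // 2)
    bracket = n ** N * b ** (N - K) - (-1) ** N * (n - k) ** (N - K) * k ** K * a ** N
    return sign * b ** (k - 1) * bracket ** d


# x^2 + 3x + 1 should give b^2 - 4c = 5; x^3 - 3x + 1 should give -4q^3 - 27r^2 = 81
print(trinomial_discriminant(2, 1, 3, 1), trinomial_discriminant(3, 1, -3, 1))
```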


11.9 Solvability of Equations by Radicals 


11.9.1 Introduction 


Certainly one remembers that point in one’s education that one first encountered the 
quadratic equation. If we are given the polynomial equation 


\[ ax^2 + bx + c = 0, \]

where $a \neq 0$ (to ensure that the polynomial is indeed "quadratic"), then the roots of this equation are given by the formula

\[ \left(-b \pm \sqrt{b^2 - 4ac}\right)/2a. \]


Later, perhaps, the formula is justified by the procedure known as “completing the 
square”. One adds some constant to both sides of the equation so that the left side is 
the square of a linear polynomial, and then one takes square roots. It is fascinating to 
realize that this idea of completing the square goes back at least two thousand years 
to the Near East and India. It means that at this early stage, there is the suggestion that 
there could be a realm where square roots could always be taken, and the subtlety 
that there are cases in which square roots can never be taken because the radicand is
negative.

A second method of solving the equation involves renaming the variable $x$ in a suitable way. One first divides the equation through by $a$ so that it has the form $x^2 + b'x + c' = 0$. Then setting $y := x + b'/2$, the equation becomes

\[ y^2 + c' - (b')^2/4 = 0, \]

from which extraction of the square root (if that can be done) yields $y = \pm\sqrt{(b')^2 - 4c'}/2$. The original zero named by $x$ is obtained by $x = y - b'/2$, and the equations $b' = b/a$ and $c' = c/a$.

¹⁰ R.G. Swan, Factorization of polynomials over finite fields, Pacific Journal, vol 12 (1962), pp. 1099-1106, MR 26 #2432; see also Gary Greenfield and Daniel Drucker, On the discriminant of a trinomial, Linear Algebra Appl. 62 (1984), 105-112.

Certainly the polynomials in these equations were seen as generic expressions 
representing relations between some sort of possible numbers. But these numbers 
were known not to be rational numbers in many cases. One could say that solutions 
don’t exist in those cases, or that there is a larger realm of numbers (for example 
fields like $\mathbb{Q}(\sqrt{2})$ or the reals), and the latter view only began to emerge in the last 
three centuries. 

Of course the quadratic formula is not valid over fields of characteristic 2 since 
dividing by 2 was used in deriving and expressing the formula. The same occurs for 
the solutions of the cubic and quartic equations: fields of characteristic 2 and 3 must 
be excluded. 

Dividing through by the coefficient of $u^3$, the general cubic equation has the form 
$u^3 + bu^2 + cu + d = 0$, where $u$ is the generic unknown root. Replacing $u$ by $x - b/3$ 
yields the simpler equation 

$$x^3 + qx + r = 0.$$


Next, one puts $x = y + z$ to obtain 

$$y^3 + z^3 + (3yz + q)x + r = 0. \tag{11.28}$$


There is still a degree of freedom allowing one to demand that $yz = -q/3$, thus 
eliminating the coefficient of $x$ in (11.28). Then $z^3$ and $y^3$ are connected by what 
amounts to the norm and trace of a quadratic equation: 


$$z^3 + y^3 = -r,$$


and 

$$z^3 y^3 = -q^3/27.$$


One can then solve for $z^3$ and $y^3$ by the quadratic formula. Taking cube roots (assuming 
that is possible, and choosing them so that $yz = -q/3$) one gets a value for $z$ and $y$. 
The possible zeroes of $x^3 + qx + r$ are: 

$$y + z, \quad \omega y + \omega^2 z, \quad \omega^2 y + \omega z,$$


where $\omega$ is a complex primitive cube root of unity. The formula seems to have been 
discovered in 1515 by Scipio del Ferro and independently by Tartaglia. 
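
The recipe just described can be tried out numerically. The following is only a sketch under stated assumptions: it works over the complex numbers in floating point (so "taking cube roots" is always possible), and the function name `depressed_cubic_roots` is ours, not the text's.

```python
import cmath

def depressed_cubic_roots(q, r):
    """Approximate zeros of x^3 + q*x + r via x = y + z with y*z = -q/3."""
    # y^3 and z^3 are the two zeros of T^2 + r*T - q^3/27 (sum -r, product -q^3/27).
    disc = cmath.sqrt(r * r + 4 * q**3 / 27)
    t1, t2 = (-r + disc) / 2, (-r - disc) / 2
    # Take the larger of the two as y^3, so that y = 0 only when q = r = 0.
    y3 = t1 if abs(t1) >= abs(t2) else t2
    if y3 == 0:
        return [0j, 0j, 0j]
    y = y3 ** (1 / 3)      # one complex cube root of y^3
    z = -q / (3 * y)       # forces y*z = -q/3, hence z^3 is the other zero
    w = complex(-0.5, 3 ** 0.5 / 2)  # a primitive cube root of unity
    return [y + z, w * y + w * w * z, w * w * y + w * z]

# Example: x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3).
roots = depressed_cubic_roots(-7, 6)
```

Note that in this example the intermediate quantity $y^3$ is not real even though all three zeros are real.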

Just thirty years later, when a manuscript of Cardan published the cubic formula, 
Ferrari discovered the formula for solving the general quartic equation. Here one 
seeks the roots of a monic fourth-degree polynomial, whose cubic term can be 
eliminated by a linear substitution to yield an equation 


$$x^4 + qx^2 + rx + s = 0.$$

We would like to re-express the left side as 

$$(x^2 - kx + u)(x^2 + kx + v),$$


for suitable numbers $k$, $u$ and $v$. From the three equations arising from equating 
coefficients, the first two allow $u$ and $v$ to be expressed in terms of $k$, and a substitution 
of these expressions for $u$ and $v$ into the third equation produces a cubic in $k^2$, which 
can be solved by the cubic formula. Then $u$ and $v$ are determined and so the roots 
can be obtained from two applications of the quadratic formula. 
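
To make the three coefficient equations concrete: expanding the product gives $q = u + v - k^2$, $r = k(u - v)$ and $s = uv$, and eliminating $u$ and $v$ leaves the cubic $t^3 + 2qt^2 + (q^2 - 4s)t - r^2 = 0$ in $t = k^2$. This explicit form is our own expansion (the text does not write it out), and the helper names below are hypothetical. The sketch checks it with exact rational arithmetic.

```python
from fractions import Fraction

def quartic_from_factors(k, u, v):
    """Return (q, r, s) with (x^2 - k x + u)(x^2 + k x + v) = x^4 + q x^2 + r x + s."""
    return u + v - k * k, k * (u - v), u * v

def resolvent_cubic(q, r, s, t):
    """Value at t (= k^2) of t^3 + 2 q t^2 + (q^2 - 4 s) t - r^2."""
    return t**3 + 2 * q * t**2 + (q * q - 4 * s) * t - r * r

def recover_u_v(q, r, k):
    """Solve u + v = q + k^2 and u - v = r / k for (u, v); k must be non-zero."""
    total, diff = q + k * k, Fraction(r) / k
    return (total + diff) / 2, (total - diff) / 2

# Start from a known factorization and confirm that k^2 is a zero of the cubic.
k, u, v = Fraction(3, 2), Fraction(-1), Fraction(5)
q, r, s = quartic_from_factors(k, u, v)
value_at_k_squared = resolvent_cubic(q, r, s, k * k)
```

In practice one would run this in the other direction: find a zero $t$ of the cubic, take $k$ to be a square root of $t$, and then read off $u$ and $v$.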

It is hardly surprising, with all of this success concentrated in the early sixteenth 
century, that the race would be on to solve the general quintic equation. But this 
effort met with complete frustration for the next 270 years. 

Looking back we can make a little better sense of this. First of all, where are we 
finding these square roots and cube roots that appear in these formulae? That can be 
answered easily from the concepts of this chapter. If we wish to find the $n$th root of 
a number $w$ in a field $F$, we are seeking a root of the polynomial $x^n - w \in F[x]$. 
And of course we need only form the field $F[\alpha] \simeq F[x]/p(x)F[x]$ where $p(x)$ is 
any irreducible factor of $x^n - w$. But in that explanation, why should we start with 
a general field $F$? Why not the rational numbers? The answer is that in the formula 
for the general quartic equation, one has to take square roots of numbers which 
themselves are roots of a cubic equation. So in general, if one wishes to consider the 
possibility of having a formula for an nth degree equation, he or she must be prepared 
to extract pth roots of numbers which are already in an uncontrollably large class of 
field extensions of the rational numbers, or whatever field we wish to begin with. 

Then there is this question of actually having an explicit formula that is good 
for every nth degree equation. To be sure, the sort of formula one has in mind 
would involve only the operations of root extraction, multiplication and addition and 
taking multiplicative inverses. But even if no single formula is good for all 
equations, might it still be possible that the roots of a polynomial 
$f(x) \in F[x]$ can be found in a splitting field $K$ which is obtained by a series of 
root extractions? Put more precisely, $K$ would lie in some field $F_r$ which is the terminal 
member of a tower of fields 


$$F = F_0 < F_1 < \cdots < F_r,$$


where $F_{i+1} = F_i(\alpha_i)$ and $\alpha_i^{n_i} \in F_i$ for positive integers $n_i$, $i = 0, \ldots, r-1$. In this 
case we call $F_r$ a radical extension of $F$. Certainly, if there is a formula for the roots 
of the general $n$th degree equation, there must be a radical extension $F_r/F$ which 
contains a copy of the splitting field $K_{f(x)}$ for every polynomial $f(x)$ of degree 
$n$. Yet something more general might happen: conceivably, for each polynomial 
$f(x)$, there is a radical extension $F_f/F$, containing a copy of the splitting field of 
$f(x)$, which is special for that polynomial, without there being one universal radical 
extension which does the job for everybody. Thus we say that $f(x)$ is solvable by 
radicals if and only if its splitting field lies in some radical extension. Then, if there 
is a polynomial not solvable by radicals, there can be no universal formula. Indeed 
that turned out to be the case for polynomials of degree at least five. 

This field-theoretic viewpoint was basically the invention of a French teenager, 
Évariste Galois. His account of it appears in a final letter frantically written the night 
before he died in a duel, just before his twenty-first birthday, we are told.¹¹ The 
discovery of his theory might have been delayed for many decades had this letter not 
come to light some years after his death.¹² For further reading on Galois, the authors 
recommend the following references: 


Livio, Mario, The Equation That Couldn’t Be Solved: How Mathematical Genius 
Discovered the Language of Symmetry, Simon and Schuster, New York, 2005 


Rothman, Tony, “The short life of Évariste Galois,” Scientific American, 246, no. 4 
(1982), p. 136. 


Rothman, Tony, “Genius and biographers: the fictionalization of Évariste Galois,” 
American Math. Monthly, 89 (1982), p. 84. 


The next sections will attempt to describe this theory. 


11.9.2 Roots of Unity 


Any zero of $x^n - 1 \in F[x]$ lying in some extension field of $F$ is called an $n$th root 
of unity. If $F$ has characteristic $p$, a prime, we may write $n = m \cdot p^a$ where $p$ does 
not divide $m$ and observe that 


$$x^n - 1 = (x^m - 1)^{p^a}. \tag{11.29}$$


Thus all $n$th roots of unity are in fact $m$th roots of unity in this case. 
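
Identity (11.29) is easy to confirm for small cases by multiplying polynomials with coefficients reduced mod $p$. The sketch below is illustrative only: polynomials are stored as coefficient lists (lowest degree first), and the helper names are ours.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (lowest degree first), reduced mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_pow_mod(f, e, p):
    """f raised to the e-th power by repeated multiplication, mod p."""
    result = [1]
    for _ in range(e):
        result = poly_mul_mod(result, f, p)
    return result

def x_power_minus_one(n, p):
    """Coefficient list of x^n - 1 over the integers mod p (n >= 1)."""
    return [(-1) % p] + [0] * (n - 1) + [1]

# n = 6 = 2 * 3 in characteristic 3: (x^2 - 1)^3 should equal x^6 - 1.
lhs = x_power_minus_one(6, 3)
rhs = poly_pow_mod(x_power_minus_one(2, 3), 3, 3)
```

All the middle binomial coefficients vanish mod $p$, which is exactly why the $p^a$-th power map respects the difference $x^m - 1$.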

Of course if $\alpha$ and $\beta$ are $n$th roots of unity lying in some extension field $K$ 
of $F$, then $(\alpha\beta)^n = \alpha^n\beta^n = 1$ so $\alpha\beta$ is also an $n$th root of unity. Thus the $n$th 
roots of unity lying in $K$ always form a finite multiplicative subgroup of $K^*$, which, 
by Corollary 11.2.9, is necessarily cyclic (of order $m$, if $n = mp^a$ as above). Any 
generator of this cyclic group is called a primitive $n$th root of unity. 

Suppose now that $n$ is relatively prime to the characteristic $p$ of $F$, or that $F$ has 
characteristic zero. Let $K$ be the splitting field of $x^n - 1$ over $F$. Then $x^n - 1$ is 
relatively prime to its formal derivative, as $nx^{n-1} \neq 0$. Therefore, in this case $x^n - 1$ 
splits into $n$ distinct linear factors in $K[x]$. But even if $F$ had positive characteristic $p$ and if $n = mp^a$ 
where $p$ does not divide $m$, the $n$th roots of unity are, by Eq. (11.29), just the $m$th 
roots of unity, and so the splitting field of $x^n - 1$ is just the splitting field of the 
separable polynomial $x^m - 1$. Thus in either case, the splitting field $K$ of $x^n - 1$ over 
the field $F$ is separable (and, of course, normal), and hence is Galois over $F$. 


¹¹ There is some doubt about this, but his birthday nonetheless seems to be near this date. 


¹² What appears to be an early hand-written copy of this letter (in French) now exists as a photocopy in 
some American University libraries, for example, the Morris Library at Southern Illinois University. 

We denote the splitting field over $F$ of $x^n - 1$ by $K$, and set $G = \mathrm{Gal}(K/F)$. 
Since K is generated over F by the nth roots of unity, and since G clearly acts on 
these roots of unity, this action is necessarily faithful. Finally, it is clear that G acts 
as a group of automorphisms of the (cyclic) group of nth roots of unity; since the 
full automorphism group of a cyclic group is abelian, G is abelian, as well. 

We summarize these observations in the following way: 


Lemma 11.9.1 Let $n$ be any positive integer and let $F$ be any field. Let $K$ be the 
splitting field of $x^n - 1$ over $F$. Then the following hold: 


(i) $K$ is separable over $F$ and is a simple extension $K = F[\zeta]$, where $\zeta$ is a 
primitive $n$th root of unity. 
(ii) $\mathrm{Gal}(K/F)$ is an abelian group. 
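

The abelian property rests on the fact that every automorphism of a cyclic group of order $n$ is a power map $\zeta \mapsto \zeta^k$ with $\gcd(k, n) = 1$. A small sketch, assuming we model the group of $n$th roots of unity by exponents mod $n$ (the function names are ours), lists these maps and checks that they commute. The Galois group itself is in general only a subgroup of this list.

```python
from math import gcd

def power_maps(n):
    """Exponents k in 1..n for which zeta -> zeta^k permutes the n-th roots of unity."""
    return [k for k in range(1, n + 1) if gcd(k, n) == 1]

def automorphism(k, n):
    """The map e -> k*e mod n on exponents, i.e. zeta^e -> zeta^(k*e)."""
    return tuple((k * e) % n for e in range(n))

def compose(s, t):
    """Composite (s after t) of two maps given as tuples of images."""
    return tuple(s[t[e]] for e in range(len(t)))

n = 12
autos = [automorphism(k, n) for k in power_maps(n)]
commutes = all(compose(s, t) == compose(t, s) for s in autos for t in autos)
```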


11.9.3 Radical Extensions 


Suppose $n$ is a positive integer. We say that a field $F$ contains all $n$th roots of unity 
if and only if the polynomial $x^n - 1$ splits completely into linear factors in $F[x]$. 


Lemma 11.9.2 Let $F$ be a field, let $n$ be a positive integer, and suppose $F$ contains 
all $n$th roots of unity. Suppose $\alpha$ is an element of some extension field of $F$, where 
$\alpha^n \in F$. Then 


(i) the simple extension $F[\alpha]$ is normal over $F$, and 
(ii) the Galois group $\mathrm{Gal}(F[\alpha]/F)$ is cyclic of order equal to a divisor of $n$. 


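Proof Note that if $K$ is a splitting field over $F$ for $f(x) = x^n - \alpha^n \in F[x]$, 
then for any zero $\alpha' \in K$ of $f(x)$, we have $(\alpha'/\alpha)^n = 1$, and so, by hypothesis, 
$\eta = \alpha'/\alpha \in F$, and so $\alpha' = \eta\alpha \in F(\alpha)$, proving that $K = F(\alpha)$, and so $F(\alpha)$ is a 
splitting field over $F$ for $x^n - \alpha^n$, hence is normal. 

Next, the multiplicative group of the $n$th roots of unity in $F$ is a cyclic group; 
let $\zeta$ be a generator of this group. Where $G = \mathrm{Gal}(F(\alpha)/F)$ is the Galois group 
of $F(\alpha)$ over $F$, we define a mapping $\phi : G \to \langle\zeta\rangle$ as follows. We know that 
if $\sigma \in G$, then $\sigma(\alpha) = \zeta^{i_\sigma}\alpha$, for some index $i_\sigma$; we set $\phi(\sigma) = \zeta^{i_\sigma}$. Next, if 
$\sigma, \sigma' \in G$, then $\sigma\sigma'(\alpha) = \sigma(\zeta^{i_{\sigma'}}\alpha) = \zeta^{i_{\sigma'}}\sigma(\alpha) = \zeta^{i_{\sigma'}}\zeta^{i_\sigma}\alpha$, and so it follows that 
$\phi(\sigma\sigma') = \zeta^{i_{\sigma'}}\zeta^{i_\sigma} = \zeta^{i_\sigma}\zeta^{i_{\sigma'}} = \phi(\sigma)\phi(\sigma')$, i.e., $\phi$ is a homomorphism of groups. If 
$\sigma \in \ker\phi$, then $\sigma(\alpha) = \alpha$. But then, $\sigma(\zeta^j\alpha) = \zeta^j\sigma(\alpha) = \zeta^j\alpha$, forcing $\sigma = 1$. 
Therefore, $\phi$ embeds $G$ as a subgroup of the cyclic group $\langle\zeta\rangle$, proving the result. 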


Recall from p. 401 that a polynomial $p(x) \in F[x]$ is said to be solvable by 
radicals if and only if its splitting field lies in a radical extension of $F$. 


Theorem 11.9.3 Suppose $p(x)$ is a polynomial in $F[x]$ which is solvable by 
radicals. Then if $E$ is a splitting field for $p(x)$ over $F$, $\mathrm{Gal}(E/F)$ is a solvable group. 

Proof By hypothesis, there is a radical extension $F = F_0 \subseteq F_1 \subseteq \cdots \subseteq F_r$ such 
that $E \subseteq F_r$. Assume that $F_s = F_{s-1}(\alpha_s)$, $s = 1, 2, \ldots, r$, where each $\alpha_s^{j_s} \in F_{s-1}$. 
Let $m$ be the least common multiple of $j_1, j_2, \ldots, j_r$, and let $K$ be a splitting field 
over $F_r$ of $x^m - 1$. It suffices to show that $\mathrm{Gal}(K/F)$ is solvable, for then $\mathrm{Gal}(E/F)$ 
would be a quotient of $\mathrm{Gal}(K/F)$, and hence would also be solvable. 

We argue by induction that $\mathrm{Gal}(K/F)$ is solvable. Let $\eta$ be a generator for the 
cyclic group of $m$th roots of unity in $K$. We have the tower 


$$F_0 \subseteq F_0(\eta) \subseteq F_1(\eta) \subseteq \cdots \subseteq F_{r-1}(\eta) \subseteq F_r(\eta) = K \supseteq E.$$


By Lemma 11.9.2 part (i) each of the intermediate subfields $F_j(\eta)$ is normal over $F$. 
We get a short exact sequence 


$$1 \to \mathrm{Gal}(F_r(\eta)/F_{r-1}(\eta)) \to \mathrm{Gal}(K/F) \to \mathrm{Gal}(F_{r-1}(\eta)/F) \to 1.$$

By Lemma 11.9.2 part (ii) $\mathrm{Gal}(F_r(\eta)/F_{r-1}(\eta))$ is solvable, and by induction 
$\mathrm{Gal}(F_{r-1}(\eta)/F)$ is solvable. It follows that $\mathrm{Gal}(K/F)$ is solvable. The proof is 
complete. 


11.9.4 Galois’ Criterion for Solvability by Radicals 


In this subsection we will prove a converse to Theorem 11.9.3 for polynomials over a 
ground field in characteristic zero. (In characteristic $p$ the converse statement is not 
even true.) In order to do this, we first require a partial converse to Lemma 11.9.2. 


Lemma 11.9.4 Let $F \subseteq K$ be a Galois extension of prime degree $q$, and assume 
that the characteristic of $F$ is distinct from $q$. Assume further that $x^q - 1$ splits 
completely in $F[x]$. Then $K$ is a simple radical extension of $F$; that is, $K = F[\zeta]$ 
where $\zeta^q \in F$. 


Proof Our assumption about the characteristic guarantees that the polynomial $x^q - 1$ 
splits completely into $q$ distinct linear factors in $F[x]$. Let $z_1, z_2, \ldots, z_q$ be the 
distinct $q$th roots of unity in $F$. We know that the Galois group $G = \mathrm{Gal}(K/F)$ 
has order dividing the prime $q$, and so it is cyclic. Fix a generator $\sigma \in G$, so that 
$G = \langle\sigma\rangle$ where $\sigma^q = 1 \in G$. (Note that $\sigma$ may be the identity automorphism.) Now, 
for each element $\theta \in K \setminus F$ we set $\theta_i = \sigma^{i-1}(\theta)$, $i = 1, 2, \ldots, q$ and define the 
corresponding Lagrange resolvents: 


$$(z_i, \theta) = \theta_1 + \theta_2 z_i + \theta_3 z_i^2 + \cdots + \theta_q z_i^{q-1} \tag{11.30}$$

$$= \sum_{j=1}^{q} \theta_j z_i^{j-1} \tag{11.31}$$

$$= \sum_{j=0}^{q-1} \sigma^j(\theta) z_i^j. \tag{11.32}$$


Note that $\sigma(z_i, \theta) = \theta_2 + \theta_3 z_i + \cdots + \theta_1 z_i^{q-1} = z_i^{-1}(z_i, \theta)$, and so it follows that 
$\sigma\big((z_i, \theta)^q\big) = z_i^{-q}(z_i, \theta)^q = (z_i, \theta)^q$, which says, of course, that $(z_i, \theta)^q \in F$. Thus, we'll be finished 
as soon as we can show that one of the resolvents $(z_i, \theta) \notin F$. We display Eq. (11.30) 
in matrix form: 


$$\begin{pmatrix} 1 & z_1 & z_1^2 & \cdots & z_1^{q-1} \\ 1 & z_2 & z_2^2 & \cdots & z_2^{q-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & z_q & z_q^2 & \cdots & z_q^{q-1} \end{pmatrix} \begin{pmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_q \end{pmatrix} = \begin{pmatrix} (z_1, \theta) \\ (z_2, \theta) \\ \vdots \\ (z_q, \theta) \end{pmatrix}.$$


As we have already observed, the $q$th roots of unity are all distinct (by our 
assumption that the characteristic of $F$ does not divide $q$), and so by Corollary 11.2.8, this 
Vandermonde matrix on the left can be inverted to solve for each of the coefficients 
$\theta_1, \theta_2, \ldots, \theta_q$ as $F$-linear combinations of the resolvents $(z_i, \theta)$, $i = 1, 2, \ldots, q$. 
Since $\theta_1 = \theta \notin F$, we conclude immediately that at least one of the resolvents 
$(z_i, \theta) \notin F$, and we are finished. 
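

The inversion of the Vandermonde system can be made explicit: since the $z_i$ run over all $q$th roots of unity, one has $\theta_{j+1} = \frac{1}{q}\sum_i z_i^{-j}(z_i, \theta)$. The following numerical sketch illustrates this, assuming complex floating-point roots of unity as a stand-in for the field $F$ and arbitrary numbers as a stand-in for the conjugates $\theta_1, \ldots, \theta_q$. The function names are ours.

```python
import cmath

def roots_of_unity(q):
    """The q complex q-th roots of unity (floating-point approximations)."""
    return [cmath.exp(2j * cmath.pi * i / q) for i in range(q)]

def resolvents(thetas):
    """Lagrange resolvents (z, theta) = sum_j thetas[j] * z^j, one per root of unity z."""
    q = len(thetas)
    return [sum(t * z**j for j, t in enumerate(thetas)) for z in roots_of_unity(q)]

def recover_thetas(res):
    """Invert the Vandermonde system: thetas[j] = (1/q) * sum_i z_i^(-j) * res[i]."""
    q = len(res)
    zs = roots_of_unity(q)
    return [sum(r * z ** (-j) for r, z in zip(res, zs)) / q for j in range(q)]

thetas = [1.0, 2.5, -3.0]          # stand-ins for theta_1, theta_2, theta_3 (q = 3)
recovered = recover_thetas(resolvents(thetas))
```

The division by $q$ is where the hypothesis that the characteristic differs from $q$ enters.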


We are now ready for the main result of this section. 


Theorem 11.9.5 Suppose F is a field of characteristic zero and that f(x) is a 
polynomial in F[x]. Let K be the splitting field of f (x) over F. If G = Gal(K/F) 
is a solvable group then the equation f(x) = 0 is solvable in radicals. 


Proof Let $E$ be a splitting field over $K$ for the polynomial $x^n - 1$, where $n = |G|$, 
and let $\zeta$ be a primitive $n$th root of unity so that $E = K(\zeta)$. Now both $K$ and 
$F(\zeta)$ are splitting fields and so $E$ is a normal extension of each of the subfields. 
These extensions are Galois extensions since they are also separable for the reason 
that all fields in sight have characteristic zero. 

Since $K$ is invariant under $\mathrm{Gal}(E/F)$, there is a restriction homomorphism 
$\phi : \mathrm{Gal}(E/F) \to G$, $\sigma \mapsto \sigma|_K$, with kernel $\mathrm{Gal}(E/K)$, mapping each 
$F$-automorphism of $E$ to the automorphism it induces on $K$. Let $H = \mathrm{Gal}(E/F(\zeta))$, a 
normal subgroup of $\mathrm{Gal}(E/F)$. The restriction of the homomorphism $\phi$ to $H$ is 
injective since any element $\tau \in H$ fixing $K$ point-wise fixes $\zeta$ and hence fixes $E = K(\zeta)$ 
point-wise. Thus $\phi$ embeds $H$ into the solvable group $G = \mathrm{Gal}(K/F)$, and so $H$ 
itself is solvable. 

Next, form the subnormal series 


$$H = H_0 \trianglerighteq H_1 \trianglerighteq \cdots \trianglerighteq H_m = 1,$$


where each quotient $H_i/H_{i+1}$, $i = 0, 1, \ldots, m-1$, is cyclic of prime order. Next 
set $E_i = \mathrm{inv}_{H_i}(E)$; then $E_{i+1}$ is Galois over $E_i$ and $\mathrm{Gal}(E_{i+1}/E_i) \simeq H_i/H_{i+1}$, 
which is cyclic of prime order. Apply Lemma 11.9.4 to infer that $E_{i+1}$ is a radical 
extension of $E_i$, $i = 0, 1, \ldots, m-1$. Since the field at the bottom of this chain, 
$E_0 = \mathrm{inv}_H(E) = F(\zeta)$ is patently a radical extension of $F$, we see that $E$ is a radical 
extension of $F$. The result now follows. 


Corollary 11.9.6 (Galois’ solvability criterion) Assume f(x) € F[x] where F has 
characteristic zero. Then the polynomial f (x) is solvable by radicals if and only if 
the Galois group Gal(K /F) of the splitting field K for f (x) is a solvable group. 


Proof This is just a combination of Theorems 11.9.3 and 11.9.5. 


Example 56 Let $E = \mathrm{GF}(2)$ and let $F = E(x)$ be the field of rational functions 
in the indeterminate $x$. Let $f(y) = y^2 + xy + x \in F[y]$, irreducible by Eisenstein 
(see Exercise (8) in Sect. 9.13.1, p. 318) and separable as $\partial f(y) = x \neq 0$. Thus, the 
splitting field $K$ over $F$ for $f(y)$ is a Galois extension of degree 2, and therefore has 
Galois group $G = \mathrm{Gal}(K/F)$ cyclic of order 2. We shall argue, however, that $f(y)$ 
is not solvable by radicals. Thus, assume that $F = F_0 \subseteq F_1 \subseteq \cdots \subseteq F_r \supseteq K$ is a 
root tower over $F$, say $F_i = F_{i-1}(a_i)$, where $a_i^{n_i} \in F_{i-1}$. Next let $m$ be the least 
common multiple of the exponents $n_1, n_2, \ldots, n_r$, and let $L \supseteq F_r$ be the splitting 
field over $F_r$ for the polynomial $x^m - 1$. Let $\zeta$ be a generator of the cyclic group of 
zeros of $x^m - 1$ in $L$. 

Note that $F(\zeta) = E(\zeta)(x)$ and, again using Eisenstein, $f(y)$ continues to be 
separable and irreducible over $F(\zeta)$. Therefore $\mathrm{Gal}(F(\zeta)(\alpha)/F(\zeta))$ is cyclic of order 
2, where $\alpha \in K$ is a zero of $f(y)$. We set $E' = E(\zeta)$, a finite field of characteristic 2 over which the polynomial 
$x^m - 1$ splits completely. At the same time, we set $F'_i = F_i(\zeta)$, $i = 0, 1, \ldots, r$, so 
that $F'_0 = F(\zeta) = E'(x)$. 

Next, note that if $n_r = m_r 2^e$, where $m_r$ is odd, then set $b_r = a_r^{2^e}$ and obtain the 
subextension: 

$$F'_{r-1} \subseteq F'_{r-1}(b_r) \subseteq F'_{r-1}(a_r) = F'_r.$$

By Corollaries 11.5.7 and 11.5.11 we may conclude that $F'_{r-1}(b_r)$ is precisely the 
subfield $(F'_r)_{sep}$ of $F'_r$. If $\alpha \in K$ is a zero of $f(y)$, then $\alpha$ is separable over $F$, and 
hence is separable over every subfield of $F'_r$ containing $F$. This puts $\alpha \in (F'_r)_{sep} = F'_{r-1}(b_r)$. 
Next, as $F'_{r-1}$ contains all $m_r$th roots of unity, we may apply Lemma 11.9.2 to infer 
that $\mathrm{Gal}(F'_{r-1}(b_r)/F'_{r-1})$ is not only cyclic but is of order a divisor of $m_r$, which is 
odd. But, since $\alpha$ is separable over $F$, it is separable over $F'_{r-1}$. As $\alpha$ satisfies a 
polynomial of degree 2, and since 

$$F'_{r-1} \subseteq F'_{r-1}(\alpha) \subseteq F'_{r-1}(b_r),$$

we conclude immediately that $\alpha \in F'_{r-1}$. Continuing this argument, we eventually 
reach the conclusion that $\alpha \in F'_0 = F(\zeta)$, which is false. Therefore the polynomial 
$f(y) = y^2 + xy + x \in F[y]$ is not solvable by radicals despite having a Galois 
group which is cyclic of order 2. 


11.10 The Primitive Element Theorem 


It seems to be a tradition that a course in higher algebra must include the “primitive 
element theorem” (Theorem 11.10.2 below). Of course it is of interest, but its use- 
fulness seems vastly over-rated. It is not used to prove any further theorem in this 
book and the reader who has better things to do is invited to skip this subsection. 

In many of our concrete discussions of splitting fields of polynomials, we obtained 
such fields over the base field by adjoining several field elements. For example, an 
oft-quoted example is that of the splitting field $K$ over the rational field $\mathbb{Q}$ of the 
irreducible polynomial $x^4 - 2 \in \mathbb{Q}[x]$. Since the (complex) zeros of this polynomial 
are $\pm\omega, \pm i\omega$, where $\omega = \sqrt[4]{2}$, the definition of splitting field might compel us to 
describe $K$ by listing all of the zeros, adjoined to $\mathbb{Q}$: $K = \mathbb{Q}(\omega, -\omega, i\omega, -i\omega)$. 
However, it's obvious that the second and fourth roots listed above are superfluous, 
and so we can write $K$ more simply as $K = \mathbb{Q}(\omega, i\omega)$. Of course, there are variants 
of this representation, as a moment's thought reveals that also $K = \mathbb{Q}(i, \omega)$. At this 
stage, one might inquire as to whether one can write $K$ as a simple extension of $\mathbb{Q}$, that 
is, as $K = \mathbb{Q}(\alpha)$, for some judiciously chosen element $\alpha \in K$. In this case, we shall 
take a somewhat random stab at this question and ask whether $K = \mathbb{Q}(i + \omega)$. In this 
situation, we can argue in the affirmative, as follows. If $G = \mathrm{Gal}(K/\mathbb{Q})$, and if $\sigma \in G$, 
then clearly $\sigma(i + \omega) = \pm i \pm \omega$, or $\sigma(i + \omega) = \pm i \pm i\omega$. However, one concludes 
very easily that the elements $i$, $\omega$ and $i\omega$ are $\mathbb{Q}$-linearly independent. Therefore, 
$\sigma(i + \omega) = i + \omega$ if and only if $\sigma = 1$. That is to say, $\mathrm{Gal}(K/\mathbb{Q}(i + \omega)) = 1$, forcing 
$\mathbb{Q}(i + \omega) = K$, and we have succeeded in representing $K$ as a simple extension of $\mathbb{Q}$. 
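
The claim can be checked numerically: the eight images $\pm i + i^k\omega$ of $i + \omega$ under $G$ are pairwise distinct, so $i + \omega$ has degree 8 over $\mathbb{Q}$. The sketch below uses floating-point complex numbers; the function names, and the explicit degree-8 polynomial $x^8 + 4x^6 + 2x^4 + 28x^2 + 1$ it recovers, are our own computation rather than something stated in the text.

```python
def conjugates():
    """The eight images of i + w under Gal(K/Q), where w is the real fourth root of 2."""
    w = 2 ** 0.25
    return [s * 1j + (1j ** k) * w for s in (1, -1) for k in range(4)]

def monic_poly_from_roots(roots):
    """Coefficients (highest degree first) of the product of (x - r) over the roots."""
    coeffs = [1 + 0j]
    for rt in roots:
        # Multiply by (x - rt): shift up by one degree and subtract rt times the old poly.
        coeffs = [a - rt * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

cs = conjugates()
distinct = len({(round(c.real, 9), round(c.imag, 9)) for c in cs})
min_poly = [round(c.real) for c in monic_poly_from_roots(cs)]
```

Rounding is safe here only because the coefficients are known in advance to be integers; in general a floating-point check of this kind is suggestive rather than a proof.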

Henceforth, if $F \subseteq K$ is a simple field extension, then any element $\alpha \in K$ with 
$K = F(\alpha)$ is called a primitive element of $K$ over $F$. 

The theorem below gives a somewhat surprising litmus test for a finite extension 
to be a simple field extension. 


Theorem 11.10.1 Let $F \subseteq K$ be a field extension of finite degree. Then $K$ has a 
primitive element over $F$ if and only if there are only a finite number of subfields 
between $F$ and $K$. 


Proof Assume first that $K = F(\alpha)$ for some $\alpha \in K$. Set $f(x) = \mathrm{Irr}_F(\alpha)$, and 
assume that $E$ is an intermediate subfield: $F \subseteq E \subseteq K$. Let $g(x) = \mathrm{Irr}_E(\alpha)$; 
clearly $g(x) \mid f(x)$. Furthermore, if $E' \subseteq K$ is the subfield generated over $F$ by the 
coefficients of $g(x)$, then it is clear that $E' \subseteq E$. Furthermore, we also have that 
$g(x) = \mathrm{Irr}_{E'}(\alpha)$, and so 


$$[K : E] = \deg g(x) = [K : E'],$$


forcing $E = E'$. Therefore, we see that subfields of $K$ containing $F$ are generated by 
coefficients of monic factors of $f(x)$ in $K[x]$. There are clearly finitely many such factors, 
so we are finished with this case. 

Conversely, assume that there are only finitely many subfields of $K$ containing 
$F$. As $[K : F] < \infty$, we clearly have $K = F(\alpha_1, \alpha_2, \ldots, \alpha_r)$ for suitable elements 
$\alpha_1, \alpha_2, \ldots, \alpha_r \in K$. By induction, we shall be finished provided that we can show 
that for any pair of elements $\alpha, \beta \in K$, $F(\alpha, \beta)$ has a primitive element over $F$. 
Clearly, we may assume that the field $F$ is infinite (if $F$ is finite, then so is $K$, and any 
generator of the cyclic group $K^*$ is a primitive element). Our hypothesis therefore implies 
that there must exist elements $a \neq b \in F$ with $F(\alpha + a\beta) = F(\alpha + b\beta)$. But then, 


$$\beta = (a - b)^{-1}(\alpha + a\beta - \alpha - b\beta) \in F(\alpha + a\beta).$$


Thus $\alpha = (\alpha + a\beta) - a\beta \in F(\alpha + a\beta)$. Thus $F(\alpha, \beta) = F(\alpha + b\beta)$, proving the 
desired result. 


Remark The assumption above that $F \subseteq K$ be a finite-degree extension is crucial. 
Indeed, if $F$ is any field, if $x$ is indeterminate over $F$, then $K = F(x)$ has infinitely 
many subfields, including $F(x^n)$, $n = 1, 2, \ldots$. 


From Theorem 11.10.1 we extract the following main result. 


Theorem 11.10.2 (Primitive Element Theorem) Let $F \subseteq K$ be a finite-degree, 
separable field extension. Then $K$ contains a primitive element over $F$. 


Proof Given that the extension $F \subseteq K$ has finite degree and is separable, we 
may write $K = F(\alpha_1, \alpha_2, \ldots, \alpha_r)$ where $f_i(x) = \mathrm{Irr}_F(\alpha_i)$, $i = 1, 2, \ldots, r$ are 
separable polynomials. Therefore, if $E$ is a splitting field over $F$ for the product 
$f_1(x) f_2(x) \cdots f_r(x)$, then $E$ is Galois over $F$, is of finite degree over $F$, and hence 
its Galois group $G = \mathrm{Gal}(E/F)$ is a finite group. Since the subfields of $E$ containing 
$F$ are in bijective correspondence with the subgroups of $G$, we see, a fortiori, that 
there can only be finitely many subfields between $F$ and $K$. Now apply Theorem 
11.10.1. 


11.11 Transcendental Extensions 


11.11.1 Simple Transcendental Extensions 


Suppose $K$ is some extension field of $F$. Recall that an element $\alpha$ of $K$ is said to 
be algebraic over $F$ if and only if there exists a non-zero polynomial $p(x) \in F[x]$ such that 
$p(\alpha) = 0$. Otherwise, if there is no such polynomial, we say that $\alpha$ is transcendental 
over $F$ and that $F(\alpha)$ is a simple transcendental extension of $F$. 

Recall further that a rational function over $F$ is any element in the field of quotients 
$F(x)$ of the integral domain $F[x]$.¹³ Thus every rational function $r(x)$ is a quotient 
$f(x)/g(x)$ of two polynomials in $F[x]$, where $g(x)$ is not the zero polynomial, and 
$f(x)$ and $g(x)$ are relatively prime. For each element $\alpha \in K$, we understand $r(\alpha)$ to 
be the element 


$$r(\alpha) := f(\alpha) \cdot (g(\alpha))^{-1},$$


provided $\alpha$ is not a zero of $g(x)$. If $\alpha$ is transcendental over $F$, then $r(\alpha)$ is always 
defined. 


¹³ Note that round parentheses distinguish the field $F(x)$ from the polynomial ring $F[x]$. 


Lemma 11.11.1 If $F(\alpha)$ is a transcendental extension of $F$, then there is an 
isomorphism $F(\alpha) \to F(x)$ taking $\alpha$ to $x$. 


Proof Define the mapping $\epsilon : F(x) \to F(\alpha)$ by setting $\epsilon(r(x)) := r(\alpha)$, for every 
$r(x) \in F(x)$. Clearly $\epsilon$ is a ring homomorphism. Suppose 


$$r_1(x) = \frac{a(x)}{b(x)}, \qquad r_2(x) = \frac{c(x)}{d(x)},$$


are rational functions with $r_1(\alpha) = r_2(\alpha)$. We may assume $a(x)$, $b(x)$, $c(x)$ and $d(x)$ 
are all polynomials with $b(x)d(x) \neq 0$. Then 

$$a(\alpha)d(\alpha) - b(\alpha)c(\alpha) = 0.$$

Since $\alpha$ is transcendental, $a(x)d(x) = b(x)c(x)$ so $r_1(x) = r_2(x)$. Thus $\epsilon$ is injective. 
But its image is a subfield of $F(\alpha)$ containing $F$ and the element $\epsilon(x) = \alpha$ and so it 
is also surjective. Thus $\epsilon$ is bijective, and so $\epsilon^{-1}$ is the desired isomorphism. 


Now suppose $F(\alpha)$ is a simple transcendental extension of $F$, and let $\beta$ be any 
element of $F(\alpha)$. By Lemma 11.11.1, there exist a coprime pair of polynomials 
$(f(x), g(x))$, with $g(x) \neq 0 \in F[x]$, such that 


$$\beta = f(\alpha)/g(\alpha).$$

Suppose 


$$f(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
$$g(x) = b_0 + b_1 x + \cdots + b_n x^n,$$


where $n$ is the maximum of the degrees of $f(x)$ and $g(x)$, so at least one of $a_n$ or $b_n$ 
is non-zero. Then $f(\alpha) - \beta g(\alpha) = 0$, so 


$$(a_0 - \beta b_0) + (a_1 - \beta b_1)\alpha + \cdots + (a_n - \beta b_n)\alpha^n = 0.$$

Thus $\alpha$ is a zero of the polynomial 

$$h_\beta(x) := (a_0 - \beta b_0) + (a_1 - \beta b_1)x + \cdots + (a_n - \beta b_n)x^n$$


in $F(\beta)[x]$. 

Claim The polynomial $h_\beta(x)$ is the zero polynomial if and only if $\beta \in F$. 


First, if $\beta \in F$, then $h_\beta(x) = f(x) - \beta g(x)$ is in $F[x]$. If $h_\beta(x)$ were a non-zero 
polynomial, then its zero, $\alpha$, would be algebraic over $F$, which it is not. Thus $h_\beta(x)$ 
is the zero polynomial. 

On the other hand, if $h_\beta(x)$ is the zero polynomial, each coefficient $a_i - \beta b_i$ is 
zero. Now, since $g(x)$ is not the zero polynomial, for some index $j$, $b_j \neq 0$. Then 
$\beta = a_j \cdot b_j^{-1}$, an element of $F$. 

The claim is proved. 


Now suppose $\beta$ is not in $F$. Then the coefficient $(a_i - \beta b_i)$ is zero if and only 
if $a_i = 0 = b_i$, and this cannot happen for $i = n$ as $n$ was chosen. Thus $h_\beta(x)$ 
has degree $n$. So certainly $\alpha$ is algebraic over the intermediate field $F(\beta)$. Since 
$[F(\alpha) : F]$ is infinite, and $[F(\alpha) : F(\beta)]$ is finite, $[F(\beta) : F]$ is infinite, and so $\beta$ is 
transcendental over $F$. 

We shall now show that $h_\beta(x)$ is irreducible in $F(\beta)[x]$. 

Let $D := F[y]$ be the domain of polynomials in the indeterminate $y$, with 
coefficients from $F$. (This statement is here only to explain the appearance of the new 
symbol $y$.) Then $F(y)$ is the quotient field of $D$. Since $\beta$ is transcendental over $F$, 
Lemma 11.11.1 yields an $F$-isomorphism 


$$\epsilon : F(\beta) \to F(y)$$


of fields which can be extended to a ring isomorphism 

$$\epsilon^* : F(\beta)[x] \to F(y)[x].$$


Then $h(x) := \epsilon^*(h_\beta(x)) = f(x) - yg(x)$ is a polynomial in $D[x] = F[x, y]$. 
Moreover, $h_\beta(x)$ is irreducible in $F(\beta)[x]$ if and only if $h(x)$ is irreducible in $F(y)[x]$. 
Now since $D = F[y]$ is a UFD, by Gauss' Lemma, $h(x) \in D[x]$ is irreducible in 
$F(y)[x]$ if and only if it is irreducible in $D[x]$. Now, $D[x] = F[y, x]$ is a UFD and 
$h(x)$ is degree one in $y$. So, if $h(x)$ had a non-trivial factorization in $D[x]$, one of the 
factors $k(x)$ would be of degree zero in $y$ and hence a polynomial of positive degree 
in $F[x]$. The factorization is then 


$$h(x) = f(x) - yg(x) = k(x)(a(x) + yb(x)),$$


where $k(x)$, $a(x)$ and $b(x)$ all lie in $F[x]$. Then $f(x)$ and $g(x)$ possess the common 
factor $k(x)$ of positive degree, against the fact that $f(x)$ and $g(x)$ were coprime in 
$F[x]$. We conclude that $h(x)$ is irreducible in $F(y)[x]$, and so $h_\beta(x)$ is irreducible 
of degree $n$ in $F(\beta)[x]$ as promised. 

We now have the following 


Theorem 11.11.2 (Lüroth's Theorem) Suppose $F(\alpha)$ is a transcendental extension 
of $F$. Let $\beta$ be an element of $F(\alpha) - F$. Then we can write $\beta = f(\alpha)/g(\alpha)$ where 
$f(x)$ and $g(x)$ are a pair of coprime polynomials in $F[x]$, unique up to multiplying 
the pair through by a scalar from $F$. Let $n$ be the maximum of the degrees of $f(x)$ 
and $g(x)$. Then $[F(\alpha) : F(\beta)] = n$. In particular, $\alpha$ is algebraic over $F(\beta)$. 


Remark A general consequence is that if $E$ is an intermediate field distinct from $F$ 
(that is, $F < E < F(\alpha)$) then $[E : F]$ is infinite. In particular, $F(\alpha) - F$ contains 
no elements which are algebraic over $F$ (a fact that has an obvious direct proof from 
first principles). 


Lüroth's theorem gives us a handle on the Galois group $\mathrm{Gal}(F(\alpha)/F)$ of a 
transcendental extension. Suppose now that $\sigma$ is an $F$-automorphism $F(\alpha) \to F(\alpha)$. 
Then $\sigma$ is determined by what it does to the generator $\alpha$. Thus, setting $\beta = \sigma(\alpha)$, 
for any rational function $t(x)$ over $F$, we have $\sigma(t(\alpha)) = t(\beta)$. Now in particular 
$\beta = r(\alpha)$ for a particular rational function $r(x) = f(x)/g(x)$ where $(f(x), g(x))$ is 
a coprime pair of polynomials in $F[x]$. Since $\sigma$ is onto, $F(\beta) = F(\alpha)$. On the other 
hand, by Lüroth's Theorem, $[F(\alpha) : F(\beta)]$ is the maximum of the degrees of $f(x)$ 
and $g(x)$, so this maximum must be 1. Thus we can write 

$$\beta = \frac{a\alpha + b}{c\alpha + d}, \tag{11.33}$$

where $a$, $b$, $c$ and $d$ are in $F$, $a$ and $c$ are not both zero, and $ad - bc \neq 0$ to keep the 
polynomials $ax + b$ and $cx + d$ coprime. Actually the latter condition implies the 
former, that $a$ and $c$ are not both zero. 

Now let $G = \mathrm{Gal}(F(\alpha)/F)$ and set $O := \alpha^G$, the orbit of $\alpha$ under $G$. Our 
observations so far are that there exists a bijection between any two of the following 
three sets: 


1. the elements of the group $\mathrm{Gal}(F(\alpha)/F)$, 
2. the orbit $O$, and 
3. the set $\mathrm{LF}(F)$ of so-called “linear fractional” transformations, 


$$\omega \mapsto \frac{a\omega + b}{c\omega + d}, \qquad a, b, c, d \in F, \quad ad - cb \text{ non-zero},$$


viewed as a set of permutations of the elements of $O$. 


A perusal of how these linear fractional transformations compose shows that $\mathrm{LF}(F)$ 
is a group, and that in fact, there is a surjective group homomorphism 


$$\mathrm{GL}(2, F) \to \mathrm{LF}(F),$$


defined by 


$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \left[\text{the transformation } z \mapsto \frac{az + b}{cz + d} \text{ on the set } O\right],$$


where the matrix has non-zero determinant, i.e. is invertible. The kernel of this 
homomorphism is the subgroup of two-by-two scalar matrices $\gamma I$, comprising the 
center of $\mathrm{GL}(2, F)$. As the reader will recall, the factor group, $\mathrm{GL}(2, F)/Z(\mathrm{GL}(2, F))$ 
is called the “one-dimensional projective group over $F$” and is denoted $\mathrm{PGL}(2, F)$. 
Our observation, then, is that 


$$\mathrm{Gal}(F(\alpha)/F) \simeq \mathrm{LF}(F) \simeq \mathrm{PGL}(2, F).$$


Notice how the fundamental theorem of Galois Theory falls to pieces in the case 
of transcendental extensions. Suppose $F$ is the field $\mathbb{Z}/p\mathbb{Z}$, the ring of integers 
mod $p$, a prime number. If $\alpha$ is transcendental over $F$, then $[F(\alpha) : F]$ is infinite, 
while $\mathrm{Gal}(F(\alpha)/F)$ is $\mathrm{PGL}(2, p)$, a group of order $p(p^2 - 1)$ which, when $p \geq 5$, 
contains at index two, a simple subgroup. 
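
For small primes the order $p(p^2 - 1)$ and the homomorphism $\mathrm{GL}(2, F) \to \mathrm{LF}(F)$ can be checked by brute force. The sketch below makes one modelling assumption that differs from the text: instead of the orbit $O$ inside $F(\alpha)$, it lets the transformations act on the projective line over $\mathrm{GF}(p)$ (the elements $0, \ldots, p-1$ together with a point at infinity, represented by `None`), which gives an isomorphic permutation group. It needs Python 3.8 or later for `pow(x, -1, p)`, and all function names are ours.

```python
from itertools import product

def invertible_matrices(p):
    """All (a, b, c, d) over GF(p) with a*d - b*c non-zero."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p != 0]

def mobius(m, z, p):
    """Apply z -> (a z + b)/(c z + d) on GF(p) plus infinity (None)."""
    a, b, c, d = m
    if z is None:
        return None if c == 0 else (a * pow(c, -1, p)) % p
    den = (c * z + d) % p
    if den == 0:
        return None
    return ((a * z + b) * pow(den, -1, p)) % p

def as_permutation(m, p):
    """Images of the points 0, 1, ..., p-1, infinity under the transformation."""
    return tuple(mobius(m, z, p) for z in list(range(p)) + [None])

def mat_mul(m, n, p):
    """Product of two 2x2 matrices over GF(p), each given as (a, b, c, d)."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % p, (a*f + b*h) % p, (c*e + d*g) % p, (c*f + d*h) % p)

def pgl2_order(p):
    """Number of distinct transformations, i.e. the order of GL(2, p) mod scalars."""
    return len({as_permutation(m, p) for m in invertible_matrices(p)})
```

For instance, `pgl2_order(5)` counts 120 transformations arising from the 480 invertible matrices, reflecting the kernel of $p - 1 = 4$ scalar matrices.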

Let $K$ be an extension of the field $F$. We define a notion of algebraic dependence 
over $F$ on the elements of $K$ as follows: We say that element $\alpha$ algebraically depends 
(over $F$) on the finite subset $\{\beta_1, \ldots, \beta_n\}$ if and only if 


• $\alpha$ is algebraic over the subfield $F(\beta_1, \ldots, \beta_n)$. 


Theorem 11.11.3 Algebraic dependence over F is an abstract dependence relation 
on K in the sense of Chap. 2, Sect. 2.6. 


Proof We must show that the relation of algebraic dependence over F satisfies the 
three required properties of a dependence relation: reflexivity, transitivity and the 
exchange condition. 

Clearly if $\beta \in \{\beta_1, \ldots, \beta_n\}$, then $\beta$ lies in $F(\beta_1, \ldots, \beta_n)$ and so is algebraic over 
it. So the relation is reflexive. 

Now suppose $\gamma$ is algebraic over $F(\alpha_1, \ldots, \alpha_m)$, and each $\alpha_i$ is algebraic over 
$F(\beta_1, \ldots, \beta_n)$. Then certainly, $\alpha_i$ is algebraic over the field 


$$F(\beta_1, \ldots, \beta_n, \alpha_1, \ldots, \alpha_{i-1}).$$

So, setting $B := \{\beta_1, \ldots, \beta_n\}$, we obtain a tower of fields, 


$$F(B) < F(B, \alpha_1) < F(B, \alpha_1, \alpha_2) < \cdots < F(B, \alpha_1, \ldots, \alpha_m) < F(\beta_1, \ldots, \beta_n, \alpha_1, \ldots, \alpha_m, \gamma)$$


with each field in the tower of finite degree over its predecessor. Thus the top field in 
the tower is a finite extension of the bottom field, and so $\gamma$ is algebraic over $F(B)$. 

We now address the exchange condition: we are to show that if $\gamma$ is algebraic over 
$F(\beta_1, \ldots, \beta_n)$ but is not algebraic over $F(\beta_1, \ldots, \beta_{n-1})$, then $\beta_n$ is algebraic over 
the field $F(\beta_1, \ldots, \beta_{n-1}, \gamma)$.¹⁴ 


¹⁴ It is interesting to observe that in this context the Exchange Condition is essentially an “implicit 
function theorem”. The hypothesis of the condition, transferred to the context of algebraic 
geometry, says that the “function” $\gamma$ is expressible locally in terms of $\{\beta_1, \ldots, \beta_n\}$ but not in terms of 
$\{\beta_1, \ldots, \beta_{n-1}\}$. Intuitively, this means $\partial\gamma/\partial\beta_n \neq 0$, from which we anticipate the desired conclusion. 

By hypothesis, there is a polynomial relation 


$$0 = a_k\gamma^k + a_{k-1}\gamma^{k-1} + \cdots + a_1\gamma + a_0 \tag{11.34}$$


where $a_k$ is non-zero, and each $a_j$ is in $F(\beta_1, \ldots, \beta_n)$, and so, at the very least, can be 
expressed as a rational function, $n_j/d_j$, where the numerators $n_j$ and denominators 
$d_j$ are polynomials in $F[\beta_1, \ldots, \beta_n]$. We can then multiply Eq. (11.34) through by 
the product of the $d_j$, from which we see that without loss of generality, we may 
assume the coefficients $a_j$ are polynomials, explicitly: 


$$a_j = \sum_{i=0}^{m} b_{ij}\beta_n^i, \quad b_{ij} \in F[\beta_1, \ldots, \beta_{n-1}], \quad a_k \neq 0. \tag{11.35}$$

Then substituting the formulae of Eq. (11.35) into Eq. (11.34), and collecting the 
powers of 3, we obtain the polynomial relation 


0 = PmOn + Pm—1Bn | +++ P1Bn + Pos (11.36) 
with each coefficient : 
P= etl (11.37) 
a polynomial in the ring F[(@1,..., Bn—1, 7]. 
Now we get the required algebraic dependence of 3, on {(1,..., Gn—1, y} from 
Eq. (11.36), provided not all of the coefficients p; are zero. Suppose, by way of 
contradiction, that they were all zero. Then as y does not depend on {/1,..., Bn—1}, 


having each of the expressions in Eq. (11.37) equal to zero forces each coefficient bs; 
to be zero, so by Eq. (11.35), each a; is zero. But that is impossible since certainly 
dx 18 non-zero. 

Thus the Exchange Condition holds and the proof is complete. 
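
As a small illustration of the bookkeeping in this proof, here is a worked instance that is not taken from the text: the element $\gamma$ below is a choice made purely for concreteness. Assume $\beta_1, \beta_2$ are algebraically independent over $F$ and put $\gamma := \beta_1\beta_2^2 + \beta_2$, so that $n = 2$ and $k = 1$. Writing $a_j = \sum_i b_{ij}\beta_2^i$ and $p_i = \sum_j b_{ij}\gamma^j$, the relation for $\gamma$ is rearranged into a relation for $\beta_2$ over $F[\beta_1, \gamma]$:

```latex
\begin{align*}
0 &= a_1\gamma + a_0, \qquad a_1 = 1, \quad a_0 = -\beta_1\beta_2^2 - \beta_2, \\
a_0 &= b_{00} + b_{10}\beta_2 + b_{20}\beta_2^2
     \quad\text{with}\quad b_{00} = 0,\ b_{10} = -1,\ b_{20} = -\beta_1, \\
a_1 &= b_{01} + b_{11}\beta_2 + b_{21}\beta_2^2
     \quad\text{with}\quad b_{01} = 1,\ b_{11} = b_{21} = 0, \\
p_0 &= b_{00} + b_{01}\gamma = \gamma, \qquad
p_1 = b_{10} + b_{11}\gamma = -1, \qquad
p_2 = b_{20} + b_{21}\gamma = -\beta_1, \\
0 &= p_2\beta_2^2 + p_1\beta_2 + p_0 = -\beta_1\beta_2^2 - \beta_2 + \gamma .
\end{align*}
```

Here $p_1 = -1$ is visibly non-zero, so the last line is a genuine algebraic relation for $\beta_2$ over $F(\beta_1, \gamma)$, as the Exchange Condition requires.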


So, as we have just seen, if K is an extension of the field F, then the relation of 
being algebraically dependent over F is an abstract dependence relation, and we call 
any subset X of K which is independent with respect to this relation algebraically 
independent (over F). By the basic development of Chap. 1, Sect. 1.3, maximal 
algebraically independent sets exist, they “span” K, and any two such sets have the same 
cardinality. We call such subsets of K transcendence bases of K over F, and their 
common cardinality is called the transcendence degree of K over F, and is denoted 


trdeg(K/F). 


If X is a transcendence basis for K over F, then the “spanning” property means 
that every element of K is algebraic over the subfield F(X) generated by X. One 
calls F(X) a “purely transcendental” extension of F, although this notion conveys 
little since it is completely relative to X. For example, if x is a real number that is 
transcendental over Q, the field of rational numbers, then both $Q(x)$ and $Q(x^2)$ are 
purely transcendental extensions of Q, while the former is algebraic over the latter.$^{15}$ 

Next we observe that transcendence degrees add (in the sense of cardinal numbers) 
under iterated field extension just as ordinary degrees multiply. Specifically: 


Corollary 11.11.4 Suppose we have a tower of fields 

$$F \le K \le L.$$

Then 

$$\operatorname{trdeg}[K : F] + \operatorname{trdeg}[L : K] = \operatorname{trdeg}[L : F].$$


Proof Let X be a transcendence basis of K over F and let Y similarly be a transcendence basis of L over K. Clearly then, $X \cap Y = \emptyset$, every element of K is algebraic over $F(X)$, and every element of L is algebraic over $K(Y)$.

The corollary will be proved if we can show that $X \cup Y$ is a transcendence basis of L over F.

First, we claim that $X \cup Y$ spans L; that is, every element of L is algebraic over $F(X \cup Y)$. Let $\lambda$ be an arbitrary element of L. Then there is an algebraic relation

$$a_m\lambda^m + a_{m-1}\lambda^{m-1} + \cdots + a_1\lambda + a_0 = 0,$$

where $m \ge 1$, $a_m \ne 0$, and the coefficients $a_i$ are elements of $K(Y)$. This means that each of them can be expressed as a quotient of two polynomials in $K[Y]$. Let B be the set of all coefficients of the monomials in Y appearing in the numerators and denominators of these quotients representing $a_i$, as i ranges over $0, 1, \ldots, m$. Then B is a finite subset of K, and $\lambda$ is now algebraic over $F(X \cup Y \cup B)$. Since each element of B is algebraic over $F(X)$, $F(X \cup B)$ is a finite extension of $F(X)$, and so $F(X \cup Y \cup B)$ is a finite extension of $F(X \cup Y)$. It then follows that $F(X \cup Y \cup B, \lambda)$ is a finite extension of $F(X \cup Y)$, and so $\lambda$ is algebraic over $F(X \cup Y)$ as desired.

It remains to show that $X \cup Y$ is algebraically independent over F. But if not, there is a nonzero polynomial $p \in F[z_1, \ldots, z_{m+n}]$ and finite subsets $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$ of X and Y, respectively, such that upon substitution of the $x_i$ for the first m of the $z_i$, and the $y_i$ for the remaining n of the $z_i$, one obtains

$$0 = p(x_1, \ldots, x_m, y_1, \ldots, y_n).$$

Now this can be rewritten as a polynomial whose monomial terms are monomials in the $y_i$, with coefficients which are polynomials $c_k$ in $F[z_1, \ldots, z_m]$ evaluated at $x_1, \ldots, x_m$. Since the $y_i$ are algebraically independent over K, the coefficient polynomials $c_k$ are each equal to zero when evaluated at $x_1, \ldots, x_m$. But again, since the $x_i$ are algebraically independent over F, we see that each polynomial $c_k$ is identically 0. This means that the original polynomial p was the zero polynomial of $F[z_1, \ldots, z_{m+n}]$, establishing the algebraic independence of $X \cup Y$. The proof is complete.

$^{15}$ In this connection, there is a notion that is very useful in looking at the subject of Riemann Surfaces from an algebraic point of view: A field extension K of F of finite transcendence degree over F is called an algebraic function field if and only if, for any transcendence basis X, K is a finite extension of $F(X)$.


Corollary 11.11.5 An algebraic extension of an algebraic extension is algebraic. 
Precisely, if $F \le K \le L$ is a tower of fields with K an algebraic extension of F and 
L an algebraic extension of K, then L is an algebraic extension of F. 


Proof This follows from the previous theorem and the fact that zero plus zero is 
zero. 


Remark Before, we knew this corollary anyway, but with the word “finite degree” 
replacing the word “algebraic” in the statement. 


11.12 Exercises 


11.12.1 Exercises for Sect. 11.2 


1. Compute the minimal polynomials in Q[x] of each of the following complex 
numbers. (Recall that if a is a positive real number and n is a positive integer, 
the symbol $\sqrt[n]{a}$ refers to the positive nth root of a.) 


(a) ¥2+ V3. 

(b) V2 + ¢, where ¢ = e?7//3, 
(c) J2+V2 

(d) $\zeta + \zeta^{-1}$, where $\zeta = e^{2\pi i/16}$.
(e) $\zeta + \zeta^{-1}$, where $\zeta = e^{2\pi i/7}$.


2. Let $F \subseteq K$ be a field extension with $[K : F]$ odd. If $\alpha \in K$, prove that $F(\alpha^2) = F(\alpha)$.

3. Assume that $\alpha = a + bi \in \mathbb{C}$ is algebraic over $\mathbb{Q}$, where a is rational and b is real. Prove that $\mathrm{Irr}_{\mathbb{Q}}(\alpha)$ has even degree.

4. Let K = Q(V2, V2) CC. Compute [K : Q]. 

5. Let f(x) = x° — 9x3 + 3x +3 € Q[x]. 


(i) Show that f(x) is irreducible over Q. 
(ii) Show that f(x) is irreducible over Q(i). 


6. Let K = Q(V2, i) € C. Show that 


(a) K contains all roots of x =-2¢€ Q[x]. 
(b) Compute [K : Q]. 

7. Compute $\left[\mathbb{Q}\left(\sqrt{2 + \sqrt{2 + \sqrt{2}}}\right) : \mathbb{Q}\right]$.

8. Let $F \subseteq E \subseteq K$ be fields, let $\alpha \in K$, and let $f(x) = \mathrm{Irr}_F(\alpha)$. Assume that $[E : F]$ and $\deg f(x)$ are relatively prime. Prove that $f(x) = \mathrm{Irr}_E(\alpha)$.

9. Let F be any field, and prove that there are infinitely many irreducible polynomials in $F[x]$. [Hint: Euclid’s proof of the corresponding result for the ring $\mathbb{Z}$ works here, too.]

10. Let $F = \mathbb{C}(x)$, where $\mathbb{C}$ is the complex number field and x is an indeterminate. Assume that $F \subseteq K$ and that K contains an element y such that $y^2 = x(x - 1)$. Prove that there exists an element $z \in F(y)$ such that $F(y) = \mathbb{C}(z)$, i.e., $F(y)$ is a “simple transcendental extension” of $\mathbb{C}$.

11. Let $F \subseteq K$ be a field extension. If the subfields of K containing F are totally ordered by inclusion, prove that K is a simple extension of F. (Is the converse true?)

12. Let $\mathbb{Q} \subseteq K$ be a field extension. Assume that K is closed under taking square roots, i.e., if $\alpha \in K$, then $\sqrt{\alpha} \in K$. Prove that $[K : \mathbb{Q}] = \infty$.

13. Suppose the field F is a subring of the integral domain R. If every element of R is algebraic over F, show that R is actually a field. Give an example of a non-integral domain R containing a field F such that every element of R is algebraic over F. Obviously, R cannot be a field.

14. Let $F \subseteq K$ be fields and let $f(x), g(x) \in F[x]$ with $f(x) \mid g(x)$ in $K[x]$. Prove that $f(x) \mid g(x)$ in $F[x]$.

15. Let $F \subseteq K$ be fields and let $f(x), g(x) \in F[x]$. If $d(x)$ is the greatest common divisor of $f(x)$ and $g(x)$ in $F[x]$, prove that $d(x)$ is the greatest common divisor of $f(x)$ and $g(x)$ in $K[x]$.

16. Let $F \subseteq E_1, E_2 \subseteq E$ be fields. Define $E_1E_2 \subseteq E$ to be the smallest field containing both $E_1$ and $E_2$. $E_1E_2$ is called the composite (or compositum) of the fields $E_1$ and $E_2$. Prove that if $[E : F] < \infty$, then $[E_1E_2 : F] \le [E_1 : F] \cdot [E_2 : F]$.

17. Given a complex number $\alpha$ it can be quite difficult to determine whether $\alpha$ is algebraic or transcendental. It was known already in the nineteenth century that $\pi$ and $e$ are transcendental, but the fact that such numbers as $e^{\pi}$ and $2^{\sqrt{2}}$ are transcendental is more recent, and follows from the following deep theorem of Gelfond and Schneider: Let $\alpha$ and $\beta$ be algebraic numbers. If

$$\eta = \frac{\log \alpha}{\log \beta}$$

is irrational, then $\eta$ is transcendental. (See E. Hille, American Mathematical Monthly, vol. 49 (1942), pp. 654-661.) Using this result, prove that $2^{\sqrt{2}}$ and $e^{\pi}$ are both transcendental. [Hint: For $2^{\sqrt{2}}$ set $\alpha = 2^{\sqrt{2}}$, $\beta = 2$.]


11.12.2 Exercises for Sect. 11.3 


1. Let $f(x) = x^n - 1 \in \mathbb{Q}[x]$. In each case below, construct a splitting field K 
over Q for f(x), and compute [K : Q]. 


(i) n = p, a prime. 
(ii) m = 2p~p, where p is prime. 
(iii) n = 6. 
(iv) n = 12. 
2. Let $f(x) = x^n - 2 \in \mathbb{Q}[x]$. Construct a splitting field for f(x) over Q.

3. Let $f(x) = x^3 + x^2 - 2x - 1 \in \mathbb{Q}[x]$.

(a) Prove that f(x) is irreducible.

(b) Prove that if $\alpha \in \mathbb{C}$ is a root of f(x), so is $\alpha^2 - 2$.

(c) Let $K \supseteq \mathbb{Q}$ be a splitting field over Q for f(x). Using part (b), compute $[K : \mathbb{Q}]$.

4. Let $\zeta = e^{2\pi i/7} \in \mathbb{C}$, and let $\alpha = \zeta + \zeta^{-1}$. Show that $\mathrm{Irr}_{\mathbb{Q}}(\alpha) = x^3 + x^2 - 2x - 1$ (as in Exercise 3 above), and that $\alpha^2 - 2 = \zeta^2 + \zeta^{-2}$.

5. Write the complex splitting field for x? —2 € Q[x] in the form Q(a), for some 
aeéC. 

6. Let $\zeta = e^{2\pi i/n} \in \mathbb{C}$ and set $\omega = \zeta + \zeta^{-1}$. Show that $\mathbb{Q}(\omega)$ is a normal extension of Q.

7. Give an example of a normal extension Q C K such that [K : Q] = 3. 

8. Let F = C(x) be the field of rational functions over the complex field C and let 
f(y) = y?— x € F[y]. Let a be a zero of f(y) in some splitting field over F 
and show that F'(q@) is a normal extension of F’. 

9. Let E = Q(,/2), and let K be a normal closure of E over Q. Compute [K : Q]. 

10. Which of the following simple extensions of the rational field are normal? 


(a) Q¢/2 + v3), 
(b) Q(v2+V2), 


(c) o( D-ba/2 +¥3), 
(d) Q(¢/2), where € = e277/8. 




11.12.3 Exercises for Sect. 11.4 


1. Let q be a prime power, set $F = GF(q)$ and let $E \supseteq F$ be a field extension of degree n. Let $\sigma$ be the qth power automorphism of E (sometimes called the Frobenius map of E), given by $\sigma(\alpha) = \alpha^q$. Define the norm map

$$N = N_{E/F} : E \to F$$

by setting

$$N(\alpha) = \alpha \cdot \sigma(\alpha) \cdot \sigma^2(\alpha) \cdots \sigma^{n-1}(\alpha).$$

Note that N restricts to a mapping

$$N : E^* \to F^*,$$

where $E^*$ and $F^*$ are the multiplicative groups of non-zero elements of E and F.

(a) Show that $N : E^* \to F^*$ is a group homomorphism.

(b) Show that $|\ker N| = \dfrac{q^n - 1}{q - 1}$.


2. Let p be a prime and let r be a positive integer. Prove that there exists an irreducible polynomial of degree r over $F = GF(p)$. [Hint: Isn’t this equivalent to the existence of an extension field $K \supseteq F$ of degree r?]

3. Let p be prime, n a positive integer and set $q = p^n$. Let $F = GF(p)$ and show that if $f(x) \in F[x]$ is irreducible of degree n, then $f(x) \mid x^q - x$. More generally, show that if f(x) is irreducible of degree m, where $m \mid n$, then again, $f(x) \mid x^q - x$.

4. Show that if $F = GF(q)$, then the polynomial $x^2 + x + 1 \in F[x]$ is irreducible if and only if $3 \nmid q - 1$.

5. For any integer n, let $D_n$ be the number of irreducible polynomials of degree n in $F[x]$, where $F = GF(q)$. Prove that

$$D_n = \frac{1}{n}\sum_{k \mid n} \mu\!\left(\frac{n}{k}\right) q^k.$$

[Hint: Note that, by Exercise 3, $q^n = \sum_{k \mid n} k \cdot D_k$; now use the Möbius Inversion Theorem (see Example 43, Theorem 7.4.3, p. 217).]

6. Here’s another application of Möbius inversion. Let F be a field and let C be a finite multiplicative subgroup of the multiplicative group $F^*$ of non-zero elements of F. We know C to be cyclic. Assume that $|C| = n$, and $d \mid n$, and let $N_d$ be the sum in F of the elements of order d in C. Thus $N_n$ is the sum in F of the $\phi(n)$ generators of C. Prove that, in fact,

$$N_n = \mu(n).$$

[Hint: Study $f(n) = \sum_{d \mid n} N_d$; how does this relate to the polynomial $x^n - 1 \in F[x]$?]

7. Let $F = GF(q)$, where $q = p^n$ and p is an odd prime.

(i) Show that for any $a \in F$, and any prime power $r = p^k$, one has

$$(x - a)^{r-1} = x^{r-1} + a x^{r-2} + \cdots + a^{r-2} x + a^{r-1}.$$

(ii) Suppose m is an integer between 1 and $p - 1$, inclusive. Prove that if a is a zero of a polynomial $f(x) \in F[x]$ with multiplicity exactly m, then a is a zero of the formal derivative $f'(x)$ of multiplicity exactly $m - 1$. [Hint: Write $f(x) = (x - a)^m g(x)$ where $g(a) \ne 0$, then apply the product rule for the formal derivative.]

(iii) Suppose m is as it is in part (ii) of this problem. For fixed non-zero $a \in F$, 
define the polynomial 


p-l|. - = 
fin.a(X) 7= as xt QP, 


(When a = landm = (p—1)/2 these are knownas the Fekete polynomials.) 
Show that a is a root of fin, (x) with multiplicity exactly m. [Hint: Use (i) 
to observe that 

fora(x) +P! = (Xq)?™. 


Then observe that form > 1, fint1,a(%) = Xfjy.o(*) and apply part (ii) of 
this problem. ] 


11.12.4 Exercises for Sect. 11.5 


1. Let k = GF(3), and set F = k(x), the field of rational functions over k in x. Let 
fQ) =y8 t27y? +x € Fly]. 


(a) Show that f(y) is irreducible over F. 

(b) Let a be a root of f(y) in some splitting field over F. Is F(a) separable 
over F’? 

(c) If K = F(a) as above, determine [K : F'] and [Ksep : F]. 


2. Let F be a field of characteristic p. Determine the number of roots of the polynomial $x^n - 1 \in F[x]$ in some splitting field. [Hint: write $n = mp^e$, where p does not divide m.]


11.12.5 Exercises for Sect. 11.6 


1. Let $F \subseteq K$ be a finite extension of fields, and let $E_1, E_2$ be two intermediate subfields. Assume that $F \subseteq E_1$ is a Galois extension and that K is generated over F by $E_1$ and $E_2$. Prove that

(a) $E_2 \subseteq K$ is a Galois extension, and that
(b) $\mathrm{Gal}(K/E_2) \cong \mathrm{Gal}(E_1/E_1 \cap E_2)$.

2. Assume that $F \subseteq K$ is a field extension of finite degree, and let $E_1, E_2$ be intermediate subfields, both Galois over F, and generating K. Prove that K is Galois over $E_2$ and that $\mathrm{Gal}(K/E_2)$ is isomorphic to a subgroup of $\mathrm{Gal}(E_1/F)$. (This is the so-called Lemma on Accessory Irrationalities.)


11.12.6 Exercises for Sect. 11.8 


1. Using Möbius inversion, prove that

$$\Phi_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)}.$$

2. Consider the polynomial $x^4 + 3 \in \mathbb{Q}[x]$, and let $K \subseteq \mathbb{C}$ be its splitting field.

(a) Show that $K = \mathbb{Q}(i, \omega)$, where $\omega = \zeta\sqrt[4]{3}$, and where $\zeta = e^{2\pi i/8}$.

(b) Compute $[K : \mathbb{Q}]$.

(c) If we write the zeros of f(x) as $a_1 = \omega$, $a_2 = -\omega$, $a_3 = i\omega$, $a_4 = -i\omega$, calculate the stabilizer in G of $a_1$, with its elements written in cycle notation. (So for instance, an element $\sigma = (2\ 3\ 4)$ would fix $a_1$ and do this:

$$a_2 \mapsto a_3 \mapsto a_4 \mapsto a_2.$$

Incidentally, does such an element exist in G? Why or why not?)

(d) Is G abelian? Why or why not?

(e) Assuming that you determined that G is, in fact, non-abelian, give a non-normal subfield E, with $\mathbb{Q} \subseteq E \subseteq K$.

(f) List all the elements of G, using cycle notation. 


3. Leta = /2+ V3, and set p(x) = Irrg(a). Compute the Galois group of p(x). 


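4. Let $\alpha = \sqrt{2 + \sqrt{2 + \sqrt{2}}}$, let $q(x) = \mathrm{Irr}_{\mathbb{Q}}(\alpha)$, and compute the Galois group of q(x).

5. Are the Galois groups of the polynomials $x^8 - 2$, $x^8 - 3$, and $x^8 - 5$ in $\mathbb{Q}[x]$ all isomorphic? Investigate.

6. Let $\zeta = e^{2\pi i/32}$, and set $K = \mathbb{Q}(\zeta)$.

(a) Show that K is a separable normal extension of Q.

(b) Let $\alpha$ be as in Problem 4, above. Show that $\mathbb{Q}(\alpha) \subseteq K$ and compute $[K : \mathbb{Q}(\alpha)]$. [Hint: Look at $\zeta + \zeta^{-1}$.]

(c) Compute $\mathrm{Irr}_{\mathbb{Q}}(\zeta) \in \mathbb{Q}[x]$.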


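7. Let $f(x) = x^6 - 4x^3 + 1 \in \mathbb{Q}[x]$.

(a) Show that if $\omega = \sqrt[3]{2 + \sqrt{3}}$ is real and $\zeta = e^{2\pi i/3} \in \mathbb{C}$, then the complex splitting field K of f(x) is $\mathbb{Q}(\zeta, \omega)$.

(b) Show that f(x) is irreducible. [Hint: If it’s not, it has a linear, quadratic, or cubic factor. Use the factorization of $f(x) \in K[x]$ into linear factors and show that the product of at most three of these factors cannot be in $\mathbb{Q}[x]$.]

(c) Label the zeros of f(x) as $a_1 = \omega$, $a_2 = \zeta\omega$, $a_3 = \zeta^2\omega$, $a_4 = \omega^{-1}$, $a_5 = \zeta\omega^{-1}$, $a_6 = \zeta^2\omega^{-1}$.

i. With the above correspondence, show that the following elements are in G:

$$\gamma = (2\ 3)(5\ 6) \text{ (complex conjugation)}, \quad \sigma = (1\ 2\ 3)(4\ 6\ 5), \quad \tau = (1\ 4)(2\ 5)(3\ 6).$$

ii. Show that

$$\mathrm{Gal}(K/\mathbb{Q}(\zeta)) = \langle \sigma, \tau \rangle, \quad \mathrm{Gal}(K/\mathbb{Q}(\omega + \omega^{-1})) = \langle \gamma, \tau \rangle,$$
$$G = \langle \gamma, \sigma, \tau \rangle \cong D_{12}, \quad Z(G) = \langle \gamma\tau \rangle,$$

where $D_{12}$ is the dihedral group of order 12.

iii. Show that $\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{3}) \subseteq K$ and that $\mathrm{Gal}(K/\mathbb{Q}(\sqrt{3})) = \langle \gamma, \sigma \rangle$.

iv. Show that $\mathbb{Q} \subseteq \mathbb{Q}(i) \subseteq K$ and that $\mathrm{Gal}(K/\mathbb{Q}(i)) = \langle \sigma, \gamma\tau \rangle$.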


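8. Let $f(x) = x^6 - 2x^3 - 2 \in \mathbb{Q}[x]$.

(a) Let $\alpha = \sqrt[3]{1 + \sqrt{3}}$, $\beta = \sqrt[3]{1 - \sqrt{3}} \in \mathbb{R}$, and let $\zeta = e^{2\pi i/3} \in \mathbb{C}$. Show that the zeros of f(x) are $\zeta^j\alpha$, $\zeta^j\beta$, $j = 0, 1, 2$.

(b) Show that the complex splitting field K of f(x) over Q contains the field $L = \mathbb{Q}(i, \sqrt[3]{2}, \sqrt{3})$.

(c) Show that $[L : \mathbb{Q}] = 12$.

(d) Show that $[K : \mathbb{Q}] = 12$ or 36.

(e) Let $\omega = \sqrt[3]{2 + \sqrt{3}}$, exactly as in Exercise 7. Show that $\omega + \omega^{-1} \in K$. (Take the quotient of the two real zeros of f(x).)

(f) Show that $L \supseteq \mathbb{Q}$ is a Galois extension and that $\mathrm{Gal}(L/\mathbb{Q}) \cong D_{12}$, the dihedral group of order 12. (Arguments similar to those given for Exercise 7 will do.) Note that $D_{12}$ has three 2-Sylow subgroups.

(g) Show that if $K = L$, then there are exactly three intermediate subfields of extension degree 3 over Q, viz., $\mathbb{Q}(\sqrt[3]{2})$, $\mathbb{Q}(\zeta\sqrt[3]{2})$, and $\mathbb{Q}(\zeta^2\sqrt[3]{2})$.

(h) Show that $\mathbb{Q}(\omega + \omega^{-1}) \ne \mathbb{Q}(\sqrt[3]{2}), \mathbb{Q}(\zeta\sqrt[3]{2}), \mathbb{Q}(\zeta^2\sqrt[3]{2})$.

(i) Conclude that $[K : \mathbb{Q}] = 36$ and that $\mathrm{Gal}(K/\mathbb{Q}) \cong S_3 \times S_3$.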


9. Let K = Q(V2, V3, uv), where u = /@ a5 54/302 = 9/0), 


(a) Show that K D Q is a Galois extension. 

(b) Find an irreducible polynomial f(x) € Q[x] for which K is the splitting 
field over Q. 

(c) Show that Gal(K /Q) is nonabelian but is not dihedral. 

(d) Conclude that Gal(K /Q) must be isomorphic with $Q_8$, the quaternion group of order 8.

10. Let F be a field of characteristic $\ne 2$, and let $x_1, x_2, \ldots, x_n$ be commuting indeterminates over F. Let $K = F(x_1, x_2, \ldots, x_n)$, the field of fractions of the polynomial ring $F[x_1, x_2, \ldots, x_n]$ (or the field of rational functions in $x_1, x_2, \ldots, x_n$).

(a) Let $\sigma_1, \sigma_2, \ldots, \sigma_n$ be the elementary symmetric polynomials in $x_1, x_2, \ldots, x_n$, and set $E = F(\sigma_1, \sigma_2, \ldots, \sigma_n)$. Show that K is a splitting field over E for the polynomial $x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n\sigma_n \in E[x]$.

(b) Show that $[K : E] \le n!$.

(c) Noting that $G = S_n$ acts as a group of automorphisms of K in the obvious way, show that $E = \mathrm{inv}_G(K)$ and thereby conclude that $[K : E] = n!$.

(d) Show that K is a Galois extension of E.

(e) Set $\delta = \prod_{i<j} (x_j - x_i) \in K$ and show that $E(\delta) = \mathrm{inv}_{A_n}(K)$.

11. Let $f(x) \in \mathbb{Q}[x]$ be an irreducible polynomial and assume that the discriminant of f(x) is a perfect square in Q.

(a) Show that if $\deg f(x) = 3$, then f(x) must have three real zeros in the complex field C.

(b) Show that if $\deg f(x) = 3$, and if $\alpha \in \mathbb{R}$ is a fixed zero of f(x), then the other two zeros of f(x) are polynomials in $\alpha$ with rational coefficients.

(c) If $\deg f(x) = 4$, how many real zeros can f(x) have?

(d) If $\deg f(x) = 4$, can the Galois group have an element that acts as a 4-cycle on the four zeros of f(x)?

12. Suppose that F is a field and that $f(x) \in F[x]$. Suppose, moreover, that in some splitting field, f(x) factors as

$$f(x) = (x - \alpha_1)^2 (x - \alpha_2) \cdots (x - \alpha_{n-1}),$$

where $\alpha_1, \alpha_2, \ldots, \alpha_{n-1}$ are distinct. Show that $\alpha_1 \in F$.

13. Let f(x) = x*—9x—9 € Q[x] and compute the Galois group of this polynomial. 

14. Using Swan’s formula, compute the discriminants of x? — 7x + 3, x> — 14x? — 42 € Q[x]. 


11.12.7 Exercises for Sects. 11.9 and 11.10 


1. (Euler) Show that if F is a field possessing a primitive eighth root of unity, then $2 = 1 + 1$ is a square in F. [Hint: Let $\zeta$ be a primitive eighth root. Consider the square of $\zeta + \zeta^{-1}$.]

2. Let F be a field with a primitive twelfth root of unity. Show that the number 3 is a square in this field. [Hint: Consider the cube of $\zeta + \zeta^{-1}$ where $\zeta$ is a primitive 12th root.]$^{16}$

3. Suppose K is the splitting field for an irreducible polynomial $p(x) \in F[x]$. Suppose L is an intermediate field, that is, $F \subseteq L \subseteq K$. Show that if p(x) is solvable by radicals relative to the field F, then p(x) is solvable by radicals relative to the intermediate field L. (Remark: At first sight it seems a triviality. If solvability by radicals over F is seen as the ability to express the roots of p(x) in terms of some compound formula involving successive field operations and (multiple) root-extractions whose input variables belong to F, then the result is a triviality, for any such formula with input variables from F is a formula with input variables from the intermediate field L. But that is not exactly how we defined the property that “p(x) is solvable by radicals”. Our criterion was that the splitting field K was a radical extension of F, that is, there is a tower of fields

$$F_0 \subseteq F_1 \subseteq \cdots \subseteq F_k = K,$$

such that $F_{i+1}$ was obtained from $F_i$ by adjoining the $r_i$th root of an element in the latter field. Since L may not be one of the $F_i$, some work is needed to show that K is a radical extension of L. The point of this remark is to motivate the following hint.) [Hint: Quoting the correct Theorems and Lemmas, observe that p(x) is solvable by radicals if and only if the Galois group $G(K/F)$ is solvable. Note that $G(K/L)$ is a subgroup of the former, and apply Galois’ criterion.]

4. Suppose p(x) and r(x) are two irreducible polynomials in F'[x], and let K be the 
splitting field of p(x) over F. Suppose the field K contains a root of r(x). 


(a) Show that K contains a copy of the splitting field L of r(x). 

(b) By the “Primitive Element Theorem” (Theorem 11.10.2, p. 407), $L = F(\theta)$ for some element $\theta$ whose irreducible polynomial is $s(x) := \mathrm{Irr}(\theta) \in F[x]$. Using the theorems of the sections cited above for this exercise set, show the following:

p(x) is solvable by radicals over F if and only if (i) p(x) is solvable by radicals over L and (ii) s(x) is solvable by radicals over F.

[Hint: Use the fact that L is a normal extension of F, Galois’ criterion, and an elementary fact about solvable groups.]

5. The final exercise of this section is due to Prof. Michael Artin, and uses such a 
combination of facts that it cannot easily be assigned to any one subsection of 
this chapter. On the other hand, it is not difficult if one uses the full symphony of 
facts available. A perfect test question! 


Suppose F is a finite field of characteristic 2 and suppose K is an extension of F of degree 2.

(a) Show $K = F(\alpha)$ for some element $\alpha$ whose irreducible monic polynomial $p(x) \in F[x]$ has the specific form $x^2 + x + a$, where, of course, a belongs to the subfield F.

(b) Show that necessarily $\alpha + 1$ is also a root of the irreducible quadratic polynomial $p(x) \in F[x]$.

(c) Find an explicit formula for an automorphism of K that takes $\alpha$ to $\alpha + 1$.

$^{16}$ In the case that F is Z/(p), where p is a rational prime, the conclusions of these two exercises follow from the more extensive theory of quadratic reciprocity. A beautiful account of this, using Gauss sums, can be found in the book Elements of Number Theory by Ireland, K. and Rosen, M.I. Bogden & Quigley, New York, 1972.

11.12.8 Exercises for Sect. 11.11 


1. Let G be a finite subgroup of the general linear group $GL(n, F)$. We suppose each element $g \in G$ to be an invertible matrix $(a_{ij}^{(g)})$. Let K be the field of quotients of the polynomial ring $F[x_1, \ldots, x_n]$. (In other words, $K = F(x_1, \ldots, x_n)$ is a transcendental extension of F by algebraically independent transcendental elements $x_i$.)

(a) For each (transcendental) element $x_i$, and $g \in G$, let

$$x_i^g := a_{i1}^{(g)} x_1 + \cdots + a_{in}^{(g)} x_n.$$

For each rational function $f(x_1, \ldots, x_n)$, define

$$f^g := f(x_1^g, \ldots, x_n^g).$$

i. Show that the permutation $f \mapsto f^g$, for all $f \in K$, is a field automorphism $\phi(g)$ of K.
ii. Show that $\phi : G \to \mathrm{Aut}(K)$ is an embedding of groups.

(b) Assume G to be embedded (that is, with a little abuse of notation we write G for $\phi(G)$). Let $K_G$ be the subfield of elements of K which are fixed by all elements of G. (In this particular context, the elements of $K_G$ are called invariants of G.) Show that $G = \mathrm{Gal}(K/K_G)$.

(c) Show that any $n + 1$ rational functions $f_1, \ldots, f_{n+1} \in K$ are connected by an algebraic equation; precisely, there is a nonzero polynomial $p \in F[y_1, \ldots, y_{n+1}]$ such that

$$p(f_1, \ldots, f_{n+1}) = 0 \in K.$$

[Hint: Consider the transcendence degree of the subfield generated by $f_1, \ldots, f_{n+1}$.]

(d) Is it true that there exists a set X of n algebraically independent invariants in the subfield $K_G$ such that every element of $K_G$ is a rational function in the elements of X?


11.12.9 Exercises Associated with Appendix 1 of Chap. 10 


1. Prove Lemma 11.A.1.

2. Prove the uniqueness of the limit of a convergent sequence with respect to a valuation $\phi$ of a field F. (See the paragraph following Eq. 11.38.)

3. Prove that every convergent sequence is a Cauchy sequence.


Appendix 1: Fields with a Valuation 


Introduction 


Let R denote the field of real numbers and let F be any field. A mapping $\phi : F \to \mathbb{R}$ from F into the field of real numbers is a valuation on F if and only if, for each $\alpha$ and $\beta$ in F:


1. $\phi(\alpha) \ge 0$, and $\phi(\alpha) = 0$ if and only if $\alpha = 0$.
2. $\phi(\alpha\beta) = \phi(\alpha)\phi(\beta)$, and
3. $\phi(\alpha + \beta) \le \phi(\alpha) + \phi(\beta)$.


This concept is a generalization of certain “distance” notions which are familiar to us in the case that F is the real or complex field. For example, in the case of the real field R, the absolute value function

$$|r| = \begin{cases} r & \text{if } r \ge 0 \\ -r & \text{if } r < 0 \end{cases}$$

is easily seen to be a valuation. Similarly, for the complex numbers, the function “$\|\ \|$”, defined by

$$\|a + bi\| = \sqrt{a^2 + b^2},$$

where $a, b \in \mathbb{R}$ and the square root is non-negative, is also a valuation. In this case, upon representing complex numbers by points in the complex plane, $\|z\|$ measures the distance of z from 0. The “subadditive property” (3.) then becomes the triangle inequality.

A valuation $\phi : F \to \mathbb{R}$ is said to be archimedean if and only if there is a natural number n such that if

$$s(n) := 1 + 1 + \cdots + 1 \quad \{\text{with exactly } n \text{ summands}\},$$

in F, then $\phi(s(n)) > 1$.

These valuations of the real and complex numbers are clearly archimedean.

An example of a non-archimedean valuation is the trivial valuation for which $\phi(0) = 0$, and $\phi(\alpha) = 1$ for all non-zero elements $\alpha$ in F.

Here is another important example. Let F be the field of quotients of some unique factorization domain D which contains a prime element p. Evidently, $F \ne D$. Then any nonzero element $\alpha$ of F can be written in the form

$$\alpha = \frac{a}{b}\, p^k,$$

where $a, b \in D$, p does not divide either a or b, and k is an integer. The integer k and the element $a/b \in F$ are uniquely determined by the element $\alpha$. Then, if $\alpha \ne 0$, set

$$\phi(\alpha) = e^{-k},$$

where e is any fixed number larger than 1, such as 2.71.... Otherwise we set $\phi(0) = 0$. The student may verify that $\phi$ is a genuine valuation of F. (Only the subadditive property really needs to be verified.) From this construction, it is clear that $\phi$ is non-archimedean, that is, it is not archimedean.
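
The construction above can be tried out numerically in the special case $D = \mathbb{Z}$, $F = \mathbb{Q}$. The following is only a minimal sketch under that assumption: the function names `ord_p` and `phi` are hypothetical, and the base $e = 2$ is chosen (the text allows any fixed number larger than 1) so that the floating-point values are exact powers of two.

```python
from fractions import Fraction


def ord_p(x: Fraction, p: int) -> int:
    """Return k where x = (a/b) * p**k and p divides neither a nor b."""
    if x == 0:
        raise ValueError("the exponent k is undefined for x = 0")
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:   # strip factors of p from the numerator
        num //= p
        k += 1
    while den % p == 0:   # strip factors of p from the denominator
        den //= p
        k -= 1
    return k


def phi(x: Fraction, p: int, e: float = 2.0) -> float:
    """The valuation phi(x) = e**(-k), with phi(0) = 0."""
    return 0.0 if x == 0 else e ** (-ord_p(x, p))


x, y = Fraction(18, 5), Fraction(7, 12)
print(phi(x, 3), phi(y, 3))                         # values at p = 3
print(phi(x * y, 3) == phi(x, 3) * phi(y, 3))       # multiplicativity
print(phi(x + y, 3) <= max(phi(x, 3), phi(y, 3)))   # strong triangle inequality
```

The last check is stronger than the subadditive property (3.) and is the inequality one expects of a non-archimedean valuation.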


Lemma 11.A.1 If $\phi$ is a valuation of F, then the following statements hold:


1. $\phi(1) = 1$.
2. If $\zeta$ is a root of unity in F, then $\phi(\zeta) = 1$. In particular, $\phi(-1) = 1$ and so $\phi(-\alpha) = \phi(\alpha)$, for all $\alpha$ in F.
3. The only valuation possible on a finite field is the trivial valuation.
4. For any $\alpha, \beta \in F$, $|\phi(\alpha) - \phi(\beta)| \le \phi(\alpha - \beta)$.


The proof of this lemma is left as an exercise for the student on p. 424. 


Lemma 11.A.2 If $\phi$ is a non-archimedean valuation of F, then for any $\alpha, \beta \in F$,

$$\phi(\alpha + \beta) \le \max(\phi(\alpha), \phi(\beta)).$$

Proof From the subadditivity, and the fact that for any number of summands, $\phi(1 + 1 + \cdots + 1) \le 1$, we have

$$\phi\big((\alpha + \beta)^n\big) = \phi\Big(\sum_{j=0}^{n} \binom{n}{j} \alpha^{n-j}\beta^j\Big) \le \sum_{j=0}^{n} \phi(\alpha)^{n-j}\phi(\beta)^j \le (n + 1)\max\big(\phi(\alpha)^n, \phi(\beta)^n\big).$$

Since the taking of positive nth roots in the reals is monotone, we have

$$\phi(\alpha + \beta) \le (n + 1)^{1/n}\max(\phi(\alpha), \phi(\beta)).$$

Now since n is an arbitrary positive integer, and $(n + 1)^{1/n}$ can be made arbitrarily close to 1, the result follows.
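
The final step of the proof rests on the factor $(n + 1)^{1/n}$ tending to 1. A quick numerical look at that factor, offered only as a sanity check and not as part of the argument (the helper name `root_factor` is hypothetical):

```python
def root_factor(n: int) -> float:
    """The factor (n + 1)**(1/n) appearing in the proof of Lemma 11.A.2."""
    return (n + 1) ** (1.0 / n)


# The factor decreases toward 1 as n grows.
samples = (1, 10, 100, 1000, 10**6)
values = [root_factor(n) for n in samples]
for n, v in zip(samples, values):
    print(n, v)
```

Analytically, $(n + 1)^{1/n} = \exp(\ln(n + 1)/n)$ and $\ln(n + 1)/n \to 0$.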


The Language of Convergence 


We require a few definitions. Throughout them, $\phi$ is a valuation of the field F.
First of all, a sequence is a mapping $\mathbb{N} \to F$, from the natural numbers to the field F, but as usual it can be denoted by a displayed indexed set, for example

$$\{\alpha_0, \alpha_1, \ldots\} = \{\alpha_k\} \quad\text{or}\quad \{\beta_0, \beta_1, \ldots\} = \{\beta_k\}, \text{ etc.}$$

A sequence $\{\alpha_k\}$ is said to be a Cauchy sequence if and only if, for every positive real number $\epsilon$, there exists a natural number $N(\epsilon)$, such that

$$\phi(\alpha_m - \alpha_n) < \epsilon,$$

for all natural numbers n and m larger than $N(\epsilon)$.

A sequence $\{\alpha_k\}$ is said to converge relative to $\phi$ if and only if there exists an element $\alpha \in F$ such that for any real $\epsilon > 0$, there exists a natural number $N(\epsilon)$, such that

$$\phi(\alpha - \alpha_k) < \epsilon, \tag{11.38}$$

for all natural numbers k exceeding $N(\epsilon)$.

It is an easy exercise to show that an element $\alpha$ satisfying (11.38) for each $\epsilon > 0$ is unique. [Note that in proving uniqueness, the choice function $N : \mathbb{R}^+ \to \mathbb{N}$ taking $\epsilon$ to $N(\epsilon)$ is posited for the assertion that $\alpha$ is a limit. To assert that $\beta$ is a limit, an entirely different choice function $M : \epsilon \mapsto M(\epsilon)$ may be posited. That $\alpha = \beta$ is the thrust of Exercise 2 in Sect. 11.12.9.] In that case we say that $\alpha$ is the limit of the convergent sequence $\{\alpha_k\}$ or that the sequence $\{\alpha_k\}$ converges to $\alpha$.

A third easy exercise is to prove that every convergent sequence is a Cauchy sequence. (See Exercise 3 in Sect. 11.12.9.)
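
To make the definitions of convergence and of a Cauchy sequence concrete, here is a hedged sketch in the setting $F = \mathbb{Q}$ with the valuation $\phi(x) = 2^{-k}$ attached to the prime $p = 5$ (an assumed instance of the example given earlier; the names `phi` and `partial_sum` are hypothetical). It looks at the partial sums $s_k = 1 + p + \cdots + p^k$, whose distance to $1/(1 - p)$ shrinks as k grows.

```python
from fractions import Fraction


def phi(x: Fraction, p: int) -> float:
    """Valuation 2**(-k), where x = (a/b) * p**k; phi(0) = 0."""
    if x == 0:
        return 0.0
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return 2.0 ** (-k)


def partial_sum(k: int, p: int) -> Fraction:
    """s_k = 1 + p + p**2 + ... + p**k."""
    return sum((Fraction(p) ** j for j in range(k + 1)), Fraction(0))


p = 5
limit = Fraction(1, 1 - p)  # candidate limit 1/(1 - p)
for k in (0, 1, 5, 10):
    to_limit = phi(limit - partial_sum(k, p), p)                  # convergence
    between = phi(partial_sum(k + 3, p) - partial_sum(k, p), p)   # Cauchy condition
    print(k, to_limit, between)
```

Both printed quantities equal $2^{-(k+1)}$ here, since $1/(1-p) - s_k = p^{k+1}/(1-p)$ and $s_{k+3} - s_k$ is $p^{k+1}$ times an integer prime to p.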

A sequence which converges to 0 is called a null sequence. Thus $\{\alpha_k\}$ is a null sequence if and only if, for every real $\epsilon > 0$, however small, there exists a natural number $N(\epsilon)$, such that $\phi(\alpha_k) < \epsilon$, for all $k > N(\epsilon)$.

A field F with a valuation $\phi$ is said to be complete with respect to $\phi$ if and only if every Cauchy sequence converges (with respect to $\phi$) to an element of F.

Now, finally, a completion of F (with respect to $\phi$) is a field $\hat{F}$ with a valuation $\hat{\phi}$ having these three properties:

1. The field $\hat{F}$ is an extension of the field F.

2. $\hat{F}$ is complete with respect to $\hat{\phi}$.

3. Every element of $\hat{F}$ is the limit of a $\hat{\phi}$-convergent sequence of elements of F. (We capture this condition with the phrase “F is dense in $\hat{F}$”.)


So far our glossary entails 


Convergent sequence 

Limit 

Null sequence 

Cauchy sequence 

Completeness with respect to a valuation $\phi$
A completion of a field with a valuation 


Completions Exist 


The stage is set. In this subsection our goal is to prove that a completion $(\hat{F}, \hat{\phi})$ of a 
field with a valuation $(F, \phi)$ always exists. 

In retrospect, one may now realize that the field of real numbers R is the completion 
of the field of rational numbers Q with respect to the “absolute” valuation, | |. The 
construction of such a completion should not be construed as a “construction” of the 
real numbers. That would be entirely circular since we used the existence of the real 
numbers just to define a valuation and to deduce some of the elementary properties 
of it. 

One may observe that the set of all sequences over F possesses the structure of a ring, namely the direct product of countably many copies of the field F. Thus

$$S := F^{\mathbb{N}} = \prod_{k \in \mathbb{N}} F_k \quad\text{where each } F_k = F.$$

Using our conventional sequence notation, addition and multiplication in S are defined by the equations

$$\{\alpha_k\} + \{\beta_k\} = \{\alpha_k + \beta_k\},$$
$$\{\alpha_k\} \cdot \{\beta_k\} = \{\alpha_k\beta_k\}.$$

For each element $\alpha \in F$, the sequence $\{\alpha_k\}$ for which $\alpha_k = \alpha$, for every natural number k, is called a constant sequence and is denoted $\bar{\alpha}$. The zero element of the ring S is clearly the constant sequence $\bar{0}$. The multiplicative identity of the ring S is the constant sequence $\bar{1}$ (where “1” denotes the multiplicative identity of F).
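
The componentwise ring structure on sequences can be sketched by modelling a sequence as a function from the natural numbers to the field. This is an illustrative sketch only: it takes $F = \mathbb{Q}$, and the names `Seq`, `add`, `mul`, `const` and `harmonic` are hypothetical.

```python
from fractions import Fraction
from typing import Callable

Seq = Callable[[int], Fraction]  # a sequence is a map from N to the field


def add(a: Seq, b: Seq) -> Seq:
    """Componentwise sum {a_k} + {b_k} = {a_k + b_k}."""
    return lambda k: a(k) + b(k)


def mul(a: Seq, b: Seq) -> Seq:
    """Componentwise product {a_k} . {b_k} = {a_k * b_k}."""
    return lambda k: a(k) * b(k)


def const(x: Fraction) -> Seq:
    """The constant sequence with every term equal to x."""
    return lambda k: x


harmonic: Seq = lambda k: Fraction(1, k + 1)
zero, one = const(Fraction(0)), const(Fraction(1))

print([add(harmonic, one)(k) for k in range(4)])   # terms of {1/(k+1) + 1}
print([mul(harmonic, zero)(k) for k in range(4)])  # the zero sequence
```

In this model `zero` and `one` play the roles of the zero element and the multiplicative identity of S.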

Obviously, the constant sequences form a subring of $S$ which is isomorphic to
the field $F$. It should also be clear that every constant sequence is also a Cauchy
sequence. We shall show that the collection Ch of all Cauchy sequences also forms
a subring of $S$.

In order to do this, we first make a simple observation. A sequence $\{\alpha_k\}$ is said
to be bounded if and only if there exists a real number $b$ such that $\phi(\alpha_k) < b$ for all
natural numbers $k$. Such a real number $b$ is called an upper bound of the sequence
$\{\alpha_k\}$. The observation is


Lemma 11.A.3 Every Cauchy sequence is bounded. 


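Proof Suppose $\{\alpha_k\}$ is a Cauchy sequence. Then there is a natural number $N$ (one may
take $N = N(1) + 1$) such that $\phi(\alpha_N - \alpha_m) < 1$ for all natural numbers $m$ exceeding $N$. Now set

\[ b := 1 + \max\{\phi(\alpha_0), \phi(\alpha_1), \ldots, \phi(\alpha_N)\}. \]

If $k \le N$, then $\phi(\alpha_k) < b$ by the definition of $b$. If $k > N$, then
$\phi(\alpha_k) \le \phi(\alpha_N) + \phi(\alpha_k - \alpha_N) < b$ by the definition of $N$ and the fact that
$\phi(\alpha_N) \le b - 1$. So $b$ is an upper bound.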


Lemma 11.A.4 The collection Ch of all Cauchy sequences of $S$ is closed under
addition and multiplication of sequences. Thus Ch is a subring of $S$, the ring of all
sequences over $F$.


Proof Recall that if $\{\alpha_k\}$ is a Cauchy sequence, there must exist an auxiliary function
$N : \mathbb{R}^+ \to \mathbb{N}$ such that if $\epsilon \in \mathbb{R}^+$, then $\phi(\alpha_n - \alpha_m) < \epsilon$ for all numbers $n$ and
$m$ larger than $N(\epsilon)$. This auxiliary function $N$ is not unique; it is just that at least
one such function is available for each Cauchy sequence. If we wish to indicate an
available auxiliary function $N$, we simply write $(\{\alpha_k\}, N)$ instead of $\{\alpha_k\}$.

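Now suppose $(\{\alpha_k\}, N)$ and $(\{\beta_k\}, M)$ are two Cauchy sequences. For each positive
real number $\epsilon$ set $K(\epsilon) := \max(N(\epsilon/2), M(\epsilon/2))$. Then for any natural numbers
$n$ and $m$ exceeding $K(\epsilon)$, we have

\[ \phi(\alpha_n + \beta_n - \alpha_m - \beta_m) \le \phi(\alpha_n - \alpha_m) + \phi(\beta_n - \beta_m) < \epsilon/2 + \epsilon/2 = \epsilon. \]

Thus $(\{\alpha_k + \beta_k\}, K)$ is a Cauchy sequence.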

Again suppose $(\{\alpha_k\}, N)$ and $(\{\beta_k\}, M)$ are two Cauchy sequences. By the preceding
Lemma 11.A.3, we may assume that these sequences possess positive upper
bounds $b$ and $c$, respectively. For each positive real number $\epsilon$ set

\[ K(\epsilon) := \max(N(\epsilon/(2c)), M(\epsilon/(2b))). \]

Now suppose $n$ and $m$ are natural numbers exceeding $K(\epsilon)$. Then

\[ \begin{aligned}
\phi(\alpha_m \beta_m - \alpha_n \beta_n) &= \phi(\alpha_m \beta_m - \alpha_m \beta_n + \alpha_m \beta_n - \alpha_n \beta_n) \\
&\le \phi(\alpha_m)\phi(\beta_m - \beta_n) + \phi(\beta_n)\phi(\alpha_m - \alpha_n) \\
&< b(\epsilon/(2b)) + c(\epsilon/(2c)) = \epsilon.
\end{aligned} \]

Thus $(\{\alpha_k \beta_k\}, K)$ is a Cauchy sequence.
Next, it should be obvious that $\{-\alpha_k\}$ is the additive inverse in $S$ of $\{\alpha_k\}$ and,
from an elementary property of $\phi$, that the former is Cauchy if and only if the latter
is. Finally one notices that the constant sequence $\bar 1$ is a Cauchy sequence that is the
multiplicative identity element of Ch.
It follows that the Cauchy sequences form a subring of $S$.
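
The bookkeeping with auxiliary functions in the proof of Lemma 11.A.4 can be sketched in code. What follows is only an illustration, assuming $F$ is the field of rational numbers and $\phi$ is the ordinary absolute value. The class name `CauchySeq`, its fields `term`, `modulus` and `bound`, and the helper `is_cauchy_on_window` are hypothetical names introduced here; they are not notation from the text.

```python
from fractions import Fraction
from math import ceil


class CauchySeq:
    """A rational sequence with an auxiliary function and a strict upper bound.

    term(k)      -> the k-th term alpha_k (a Fraction)
    modulus(eps) -> a natural number N(eps) with |alpha_n - alpha_m| < eps
                    for all n, m > N(eps)
    bound        -> a positive rational b with |alpha_k| < b for all k
    """

    def __init__(self, term, modulus, bound):
        self.term, self.modulus, self.bound = term, modulus, bound

    def __add__(self, other):
        # K(eps) := max(N(eps/2), M(eps/2)), as in the proof.
        return CauchySeq(
            lambda k: self.term(k) + other.term(k),
            lambda eps: max(self.modulus(eps / 2), other.modulus(eps / 2)),
            self.bound + other.bound,
        )

    def __mul__(self, other):
        # K(eps) := max(N(eps/2c), M(eps/2b)), with b, c the upper bounds.
        b, c = self.bound, other.bound
        return CauchySeq(
            lambda k: self.term(k) * other.term(k),
            lambda eps: max(self.modulus(eps / (2 * c)), other.modulus(eps / (2 * b))),
            b * c,
        )


def is_cauchy_on_window(seq, eps, width=25):
    """Spot-check |alpha_n - alpha_m| < eps for n, m in (K, K + width]."""
    K = seq.modulus(eps)
    window = range(K + 1, K + 1 + width)
    return all(abs(seq.term(n) - seq.term(m)) < eps for n in window for m in window)


# alpha_k = 1 + 1/(k+1): for n, m > N the terms differ by less than 1/N,
# so N(eps) = ceil(1/eps) is an available auxiliary function.
alpha = CauchySeq(lambda k: 1 + Fraction(1, k + 1), lambda eps: ceil(1 / eps), Fraction(3))
# beta_k = 2 - 1/(k+1), with the same auxiliary function and bound.
beta = CauchySeq(lambda k: 2 - Fraction(1, k + 1), lambda eps: ceil(1 / eps), Fraction(3))
```

The spot check inspects only a finite window of indices, so it illustrates the estimate rather than proving it. Passing `eps` as a `Fraction` keeps all arithmetic exact.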


The null sequences of $S$ are simply the sequences that converge to zero. We have
already remarked that every convergent sequence is a Cauchy sequence. Thus the
set $Z$ of all null sequences lies in Ch. Note that the sequences whose coordinates are
nonzero at only finitely many places (that is, the elements of the direct sum $\bigoplus_{\mathbb{N}} F$) form a subset
$Z_0$ of $Z$.

Clearly the sum or difference of two null sequences is null. Suppose $\{\alpha_k\}$ is a
null sequence, so that for every positive real $\epsilon$ there is a natural number $N(\epsilon)$ such
that $\phi(\alpha_m) < \epsilon$ for all $m > N(\epsilon)$. Next let $\{\gamma_k\}$ be any Cauchy sequence with upper
bound $b$. Then, for every real $\epsilon > 0$ and $n > N(\epsilon/b)$, we see that

\[ \phi(\alpha_n \gamma_n) < b \cdot (\epsilon/b) = \epsilon. \]

Thus the product $\{\alpha_k\}\{\gamma_k\}$ is a null sequence. So we have shown the following:

Lemma 11.A.5 $Z$ is an ideal of the ring Ch.


A sequence $\{\alpha_k\}$ is said to possess a lower bound $\ell$ if and only if for some real
$\ell > 0$, one has $\phi(\alpha_n) > \ell$ for every natural number $n$. The next lemma, though
mildly technical, is important.


Lemma 11.A.6 The following assertions hold: 


(i) Suppose a sequence $\{\gamma_k\}$ in $S$ has the property that for every real $\epsilon > 0$,
$\phi(\gamma_k) > \epsilon$ for only finitely many natural numbers $k$. Then $\{\gamma_k\}$ is a null sequence.
(ii) Suppose $\{\beta_k\}$ is a Cauchy sequence which is not a null sequence. Then there
exists a real number $\lambda > 0$, and a natural number $N$ such that $\phi(\beta_k) > \lambda$ for
all $k > N$.
(iii) If $\{\alpha_k\}$ is a Cauchy sequence having a non-zero lower bound $\ell$, then it is a unit
in the ring Ch.


Proof (i). The reader may want to do this as an exercise. The hint is that the hypotheses
allow us to define a mapping $K : \mathbb{R}^+ \to \mathbb{N}$ where

\[ K(\epsilon) := \max\{k \in \mathbb{N} \mid \phi(\gamma_k) > \epsilon\} \]

(with $K(\epsilon) := 0$ when this set is empty). Then $K$ serves the desired role in the definition of a null sequence.

(ii). Using the contrapositive of the statement in part (i), we see that if $\{\beta_k\}$ is a
Cauchy sequence which is not null, there must exist some $\epsilon > 0$ such that $\phi(\beta_k) > \epsilon$
infinitely often. Also, since the sequence is Cauchy, there exists a natural number
$N(\epsilon/2)$ such that

\[ \phi(\beta_n - \beta_m) < \epsilon/2 \]

for all natural numbers $n$ and $m$ exceeding $N(\epsilon/2)$. So there is a natural number $N$
exceeding $N(\epsilon/2)$ such that $\phi(\beta_N) > \epsilon$. Then for $m > N$,

\[ \begin{aligned}
\phi(\beta_m) &= \phi(\beta_N - (\beta_N - \beta_m)) \\
&\ge \phi(\beta_N) - \phi(\beta_N - \beta_m) \\
&> \epsilon - (\epsilon/2) = \epsilon/2.
\end{aligned} \]

Thus $\lambda := \epsilon/2$ and $N$ satisfy the conclusion of part (ii).


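(iii). Let $N$ be an auxiliary function for $\{\alpha_k\}$, and for each $\epsilon > 0$ let $K(\epsilon) := N(\epsilon \cdot \ell^2)$. Then for any natural numbers $n$ and
$m$ exceeding $K(\epsilon)$, we have

\[ \phi\left(\frac{1}{\alpha_n} - \frac{1}{\alpha_m}\right) = \phi\left(\frac{1}{\alpha_n \alpha_m}\right) \phi(\alpha_m - \alpha_n) < \ell^{-2} \cdot (\epsilon \ell^2) = \epsilon. \]

Thus the sequence of reciprocals $\{\alpha_k^{-1}\}$ is a Cauchy sequence and is a multiplicative
inverse to $\{\alpha_k\}$ in Ch. The proof is complete.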


Corollary 11.A.1 The factor ring $\bar F := \mathrm{Ch}/Z$ is a field.


Proof It suffices to show that every element of $\mathrm{Ch} \setminus Z$ is congruent to a unit modulo
the ideal $Z$. Suppose $\{\beta_k\}$ is a non-null Cauchy sequence. Then by Lemma 11.A.6,
there is a positive real number $\lambda$ and a natural number $N$ such that $\phi(\beta_n) > \lambda$ for all
natural numbers $n$ greater than $N$. Now let $\{\zeta_k\}$ be a sequence for which

\[ \zeta_k = \begin{cases} 1 - \beta_k & \text{if } k \le N, \\ 0 & \text{if } k > N. \end{cases} \]

Then $\{\zeta_k + \beta_k\}$ is bounded below by $\ell := \tfrac{1}{2}\min(1, \lambda)$ and so by Lemma 11.A.6,
part (iii), is a unit. Since $\{\zeta_k\}$ has only finitely many non-zero terms, it belongs to $Z$, and we have shown the sufficient condition
announced at the beginning of this proof.


Note that we have a natural embedding $F \to \bar F$ taking each element $\alpha$ to the coset $\bar\alpha + Z$ of the
constant sequence $\bar\alpha$, all of whose terms are equal to $\alpha$.

Now we need to extend the valuation $\phi$ on (the embedded copy of) $F$ to a valuation
$\bar\phi$ of $\bar F$. The key observation is that whenever we consider a Cauchy sequence $\{\alpha_k\}$,
then $\{\phi(\alpha_k)\}$ is itself a Cauchy sequence of real numbers with respect to the absolute
value. For, given real $\epsilon > 0$, there is a number $N(\epsilon)$ such that $\phi(\alpha_n - \alpha_m) < \epsilon$ for
all $n$ and $m$ exceeding $N(\epsilon)$. But

\[ |\phi(\alpha_n) - \phi(\alpha_m)| \le \phi(\alpha_n - \alpha_m) < \epsilon \]

with the same conditions on $n$ and $m$.
Now Cauchy sequences $\{r_k\}$ of real numbers tend to a real limit which we denote
by the symbol

\[ \lim_{k \to \infty} r_k. \]

Thus, for any Cauchy sequence $\{\alpha_k\}$ in Ch, we may define

\[ \bar\phi(\{\alpha_k\}) := \lim_{k \to \infty} \phi(\alpha_k). \]


Now note that if $\{\alpha_k\}$ happens to be a sequence that already converges to $\alpha$
in $F$, then, by the definition of convergence, $\lim_{k \to \infty} \phi(\alpha_k) = \phi(\alpha)$. In short, on
convergent sequences, $\bar\phi$ can be evaluated simply by taking $\phi$ of the existing limit.
In particular this works for constant sequences, whereby we have arrived at the
fact that $\bar\phi$ extends the mapping $\phi$ defined on the embedded subfield $F$.

Now observe 


Lemma 11.A.7 If $\{\alpha_k\}$ and $\{\beta_k\}$ are two Cauchy sequences with the property that

\[ \phi(\alpha_k) \le \phi(\beta_k) \]

for all natural numbers $k$ larger than some number $N$, then

\[ \bar\phi(\{\alpha_k\}) \le \bar\phi(\{\beta_k\}). \]


Proof The student should be able to prove this, either by assuming the result for real 
Cauchy sequences, or from first principles by examining what it would mean for the 
conclusion to be false. 


Corollary 11.A.2 If $\{\alpha_k\}$ and $\{\beta_k\}$ are two Cauchy sequences, then

\[ \bar\phi(\{\alpha_k + \beta_k\}) \le \bar\phi(\{\alpha_k\}) + \bar\phi(\{\beta_k\}). \]


Proof Immediate from Lemma 11.A.7. 


Corollary 11.A.3 The following statements are true. 


(i) $\{\alpha_k\}$ is a null sequence if and only if $\bar\phi(\{\alpha_k\}) = 0 \in \mathbb{R}$.
(ii) $\bar\phi$ assumes a constant value on each coset of $Z$ in Ch.


Proof Another exercise in the meaning of the definitions. 


Now for any coset $\gamma := \{\beta_k\} + Z$ of $Z$ in Ch we can let $\bar\phi(\gamma)$ be the constant value
of $\bar\phi$ on all Cauchy sequences in this coset. In effect $\bar\phi$ is now defined on the factor
ring $\bar F = \mathrm{Ch}/Z$, which, as we have noted, is a field. Moreover, from the preceding
Corollary, $\bar\phi$ assumes the value 0 only on the zero element $\bar 0 + Z = Z$ of $\bar F$.

At this stage, and with a certain abuse of notation, $\bar\phi$ is a non-negative-real-valued
function defined on two distinct domains, Ch and $\bar F$. However the ring
epimorphism $f : \mathrm{Ch} \to \bar F$ preserves the $\bar\phi$-values. That is important for three reasons.

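First it means that if $\bar\phi$ is multiplicative on Ch, then it is on $\bar F$. But it is indeed the
case that $\bar\phi$ is multiplicative on Ch, since, for $\{\alpha_k\}$ and $\{\beta_k\}$ in Ch,

\[ \begin{aligned}
\bar\phi(\{\alpha_k \beta_k\}) &= \lim_{k \to \infty} \phi(\alpha_k)\phi(\beta_k) \\
&= \left(\lim_{k \to \infty} \phi(\alpha_k)\right) \cdot \left(\lim_{k \to \infty} \phi(\beta_k)\right).
\end{aligned} \]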


Second, the fact that the homomorphism $f$ preserves $\bar\phi$ means that if $\bar\phi$ is subadditive
on Ch, then it is also subadditive on $\bar F$. But the subadditivity on Ch is asserted
in Corollary 11.A.2.

Third, since $F \cap Z = 0 \in \mathrm{Ch}$ (that is, the only constant sequence which is null
is the zero constant sequence), the homomorphism $f$ embeds $F$ (in its transparent
guise as the field of constant sequences) into $\bar F$ without any change in the valuation.

These observations, taken together, yield 


Corollary 11.A.4 The function $\bar\phi : \bar F \to \mathbb{R}^+$ is a valuation of $\bar F$, which extends the
valuation $\phi$ on its subfield $F$.


It remains to show that $\bar F$ is complete with respect to the valuation $\bar\phi$ and that $F$
is dense in $\bar F$.

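Recall that the symbol $\bar\alpha$ denotes the constant sequence $\{\alpha, \alpha, \ldots\}$. Its image
in $\bar F$ is then $f(\bar\alpha) = \bar\alpha + Z$. Now suppose $\{\alpha_k\}$ is any Cauchy sequence, and let
$\gamma = \{\alpha_k\} + Z$ be its image in $\bar F$. We claim that the sequence

\[ \{f(\bar\alpha_1), f(\bar\alpha_2), \ldots\} \]

(of constant sequences mod $Z$) is a sequence of elements in $F$ converging to $\gamma$ relative
to $\bar\phi$. Choose a positive real number $\epsilon$. Then there exists a natural number $N(\epsilon)$ such
that

\[ \phi(\alpha_n - \alpha_m) < \epsilon \quad \text{for all } n, m > N(\epsilon). \]

Then as $f$ preserves $\bar\phi$, for every $n > N(\epsilon)$,

\[ \bar\phi(f(\bar\alpha_n) - \gamma) = \bar\phi(\bar\alpha_n - \{\alpha_k\}) = \lim_{m \to \infty} \phi(\alpha_n - \alpha_m) \le \epsilon. \]

So our claim is true. $F$ is indeed dense in $\bar F$.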


Now suppose $\{\gamma_\ell\}$ is a Cauchy sequence of elements of $\bar F$ with respect to $\bar\phi$. For
each index $\ell$, we can write

\[ \gamma_\ell = \{\beta_{\ell k}\} + Z, \]

where $\{\beta_{\ell k}\}$ is a Cauchy sequence over $F$ indexed by $k$. So there exists a natural number $N(\ell)$
such that

\[ \phi(\beta_{\ell N(\ell)} - \beta_{\ell m}) < 2^{-\ell} \tag{11.39} \]

for all $m$ exceeding $N(\ell)$. This allows us to define the following set of elements of $F$:

\[ \delta_\ell := \beta_{\ell N(\ell)} \quad \text{for } \ell = 0, 1, \ldots. \]

Then taking limits in Eq. (11.39), we have

\[ \bar\phi(f(\bar\delta_\ell) - \gamma_\ell) \le 2^{-\ell}, \]

where, as usual, $f(\bar\delta_\ell)$ is the image of a constant sequence, namely the coset
$\{\delta_\ell, \delta_\ell, \delta_\ell, \ldots\} + Z$.


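Now for any natural numbers $n$ and $m$, we have

\[ \begin{aligned}
\phi(\delta_n - \delta_m) &= \bar\phi(f(\bar\delta_n) - f(\bar\delta_m)) \\
&= \bar\phi(f(\bar\delta_n) - \gamma_n + \gamma_n - \gamma_m + \gamma_m - f(\bar\delta_m)) \\
&\le \bar\phi(f(\bar\delta_n) - \gamma_n) + \bar\phi(\gamma_n - \gamma_m) + \bar\phi(\gamma_m - f(\bar\delta_m)) \\
&\le \bar\phi(\gamma_n - \gamma_m) + 2^{-n} + 2^{-m}.
\end{aligned} \]

But since $\{\gamma_\ell\}$ is a Cauchy sequence, the above relations tell us that $\delta := \{\delta_k\}$ is a
Cauchy sequence in Ch. Moreover, for each $\ell$,

\[ \bar\phi((\delta + Z) - \gamma_\ell) = \lim_{k \to \infty} \phi(\delta_k - \beta_{\ell k}) \le \lim_{k \to \infty} \phi(\delta_k - \delta_\ell) + 2^{-\ell}, \]

and the right side tends to 0 as $\ell \to \infty$ because $\{\delta_k\}$ is Cauchy.

Thus $\{\gamma_\ell\}$ converges to $\delta + Z$ with respect to $\bar\phi$. It follows that $\bar F$ is a complete field.
Summarizing all of the results of this subsection, one has the following


Theorem 11.A.1 The field $\bar F = \mathrm{Ch}/Z$ with valuation $\bar\phi$ is a completion of $(F, \phi)$.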


The Non-archimedean p-Adic Valuations 


Our classical example of a non-archimedean valuation was derived from the choice
of a prime $p$ in a unique factorization domain $D$. The valuation $\phi$ is defined on the
field $F$ of quotients of $D$ by the rule that $\phi(r) = e^{-k}$, whenever we write $r \in F \setminus \{0\}$
in the canonical form $r = (a/b)p^k$ where $a$ and $b$ are elements of $D$ which are not
divisible by $p$, and $k$ is an integer. Of course to make this function subadditive, we
must take $e$ to be a fixed real number greater than 1. (Many authors choose $p$ itself.
That way, the values never overlap from one prime to another.)

Now notice that from Lemma 11.A.2, the collection of all elements $\alpha$ of $F$ for
which $\phi(\alpha) \le 1$ is closed under addition as well as multiplication, and so forms a
subring $O$ of $F$ called the valuation ring of $(F, \phi)$. From the multiplicative properties
of $\phi$, it is clear that every non-zero element of $F$ either lies in $O$, or has its
multiplicative inverse in $O$.

A somewhat smaller set is $P := \{\alpha \in F \mid \phi(\alpha) < 1\}$. It is easy to see that this is
an ideal in $O$. In fact the set $O - P = \{u \in F \mid \phi(u) = 1\}$ is closed under taking
multiplicative inverses in $F$ and so consists entirely of units of $O$. Conversely, since
$\phi(1) = 1$, every unit of $O$ must have $\phi$-value 1, so in fact

\[ O \setminus P = U(O), \tag{11.40} \]

the (multiplicative) group of units of $O$. Thus $P$ is the unique maximal ideal of $O$
and so $O$ is a local ring.

Note that $O$ and $P$ have been completely determined by $F$ and $\phi$. The factor
ring $O/P$ is a field called the residue field of $F$ with respect to $\phi$. The same sort
of argument used to show (11.40) can be used to show that there exists an element
$\pi \in P$, such that for any natural number $k$,

\[ P^k \setminus P^{k+1} = \pi^k U(O). \]

Taking the disjoint union of these set differences, one concludes that $P$ is the principal
ideal $P = O\pi$.

Suppose $p$ is a prime integer, $D = \mathbb{Z}$, so $F = \mathbb{Q}$, the field of rational numbers.
Then $O$ is the ring of all fractions $a/b$ where $a$ and $b$ are integers and $p$ does not divide
$b$. Terribly odd things happen! For example $\{1, p, p^2, \ldots\}$ is a null sequence (since
$\phi(p^k) = e^{-k}$, for all positive integers $k$), and

\[ 1 + p + p^2 + \cdots \]

is a "convergent series"; that is, its sequence of partial sums is a Cauchy sequence.
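
The reader may wish to see this numerically. The following sketch assumes $D = \mathbb{Z}$, takes $e = p$ as the base of the valuation, and uses exact rational arithmetic. The function names `ord_p`, `phi` and `partial_sum` are illustrative choices made here. It measures how far the partial sums are from $1/(1 - p)$, the value that the usual geometric series formula suggests.

```python
from fractions import Fraction


def ord_p(r, p):
    """Exponent k in r = (a/b) * p**k with p dividing neither a nor b (r nonzero)."""
    r = Fraction(r)
    if r == 0:
        raise ValueError("ord_p is undefined at 0")
    num, den, k = r.numerator, r.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def phi(r, p):
    """p-adic valuation phi(r) = p**(-ord_p(r)), with phi(0) = 0 (taking e = p)."""
    r = Fraction(r)
    return Fraction(0) if r == 0 else Fraction(p) ** (-ord_p(r, p))


def partial_sum(p, n):
    """s_n = 1 + p + ... + p**n."""
    return sum(Fraction(p) ** i for i in range(n + 1))


p = 5
limit = Fraction(1, 1 - p)  # candidate limit 1/(1 - p)
for n in range(6):
    # The distance phi(s_n - limit) shrinks by a factor of p at each step.
    print(n, partial_sum(p, n), phi(partial_sum(p, n) - limit, p))
```

Since $s_n - 1/(1-p) = -p^{n+1}/(1-p)$ and $p$ does not divide $1 - p$, the printed distances are exactly $p^{-(n+1)}$.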

Let us return to our classic example. There $F$ is the field of fractions of a unique
factorization domain $D$, $p$ is a prime element of $D$ and $\phi(r) = e^{-k}$ when $r \in F$ is
written in the form $(a/b)p^k$, $a, b \in D$, $k$ an integer. Let $O$ be the valuation ring and
let $P$ be its unique maximal ideal. Similarly, let $\bar F$, $\bar\phi$, $\bar O$ and $\bar P$ be the completion of $F$,
its extended valuation, valuation ring and unique maximal ideal, respectively.


Theorem 11.A.2 The following statements hold: 


(i) $D \subseteq O$, and $D \cap P = pD$.

(ii) $D + P = O$.

(iii) $F \cap \bar O = O$, $O \cap \bar P = P$.

(iv) $O + \bar P = \bar O$.

(v) The residue class fields of $F$ and $\bar F$ are both $D/pD$.


Proof (i) Clearly every element of $D$ has valuation at most 1 so the first statement
holds. Those elements of $D$ of value less than 1 are divisible by $p$, and vice versa.
So the second statement holds.

(ii) Suppose $r \in O \setminus P$. Then $r$ has the form $a/b$ where $a$ and $b$ are elements of
$D$ which are not divisible by $p$. Since $p$ is a prime and $D$ is a UFD, this means there
are elements $c$ and $d$ of $D$ such that $cp + db = a$.

Then

\[ r = \frac{a}{b} = \left(\frac{c}{b}\right) \cdot p + d. \tag{11.41} \]

Now $p$ does not divide $d$, otherwise it would divide $a = cp + db$, against our
assumption about $r$. Thus $d$ is an element of $D$ also in $O - P$, and by Eq. (11.41),
$r \equiv d$ modulo $P$. This proves (ii).

(iii) Both statements here just follow from the fact that $\bar\phi$ extends $\phi$.

(iv) Now let $\bar\alpha$ be any element in $\bar O - \bar P$ so $\bar\phi(\bar\alpha) = 1$. Since $F$ is dense in $\bar F$,
there is an element $r \in F$ such that $\bar\phi(\bar\alpha - r) < 1$. This means $r - \bar\alpha$ is an element
of $\bar P$. Since $\bar O$ is an additive subgroup of $\bar F$ containing $\bar P$, $r$ belongs to $\bar O \cap F = O$,
and we can say $r \equiv \bar\alpha \bmod \bar P$. This proves (iv).

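(v) First using (iii) and (iv), and then (i) and (ii),

\[ \begin{aligned}
\bar O/\bar P &= (O + \bar P)/\bar P \\
&\simeq O/(O \cap \bar P) = O/P, \quad \text{while} \\
O/P &= (D + P)/P \\
&\simeq D/(D \cap P) = D/pD.
\end{aligned} \]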


The proof is complete. 


Now fix a system $Y$ of coset representatives of $pD$ in the additive group $(D, +)$
and suppose $0 \in Y$. For example, if $D = \mathbb{Z}$, then $p$ is a rational prime, the residue
class field is $\mathbb{Z}/p\mathbb{Z}$, and we may take $Y$ to be the set of integers

\[ \{0, 1, 2, \ldots, p - 1\}. \]

For another example, suppose $D = K[x]$ where $K$ is some field, and $p = x + a$, a
monic polynomial of degree one. Then the residue class field is $K[x]/(x + a) \simeq K$.
But as $K$ is a subring of $D$, it makes perfect sense to take $Y = K$ (the polynomials
of degree zero).

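Now in the notation of Theorem 11.A.2, $D + \bar P = (D + P) + \bar P = O + \bar P = \bar O$.
So, for a fixed $\bar\alpha \in \bar O$, there is a unique element $a_0 \in Y$ such that $\bar\alpha - a_0 \in \bar P$. (If $\bar\alpha$
is already in $\bar P$, this element is 0.) Now $\bar P = p\bar O$, so

\[ \bar\alpha_1 := \frac{1}{p}(\bar\alpha - a_0) \in \bar O. \]

Then we repeat this procedure to obtain a unique element $a_1 \in Y$ such that $\bar\alpha_1 - a_1 \in
p\bar O$. Then

\[ \bar\alpha \equiv a_0 + p a_1 \bmod p^2 \bar O. \]

Again, one can find $a_2 \in Y$ such that if $\bar\alpha_2 := (1/p)(\bar\alpha_1 - a_1)$, then $\bar\alpha_2 - a_2 \in p\bar O$. So,
after $k$ repetitions of this procedure, we obtain a sequence of $k + 1$ unique elements
of $Y$, $(a_0, a_1, \ldots, a_k)$ such that

\[ \bar\alpha \equiv a_0 + a_1 p + a_2 p^2 + \cdots + a_k p^k \bmod \bar P^{k+1}. \]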
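
The digit-extraction procedure just described can be sketched as follows. This is an illustration under stated assumptions: $D = \mathbb{Z}$, $Y = \{0, 1, \ldots, p - 1\}$, and the element is a rational number in $O$ (denominator prime to $p$), since a general element of the completion cannot be stored exactly. The function name `p_adic_digits` is hypothetical. The residue of $a/b$ modulo $p$ is found with the three-argument `pow`, which computes modular inverses in Python 3.8 and later.

```python
from fractions import Fraction


def p_adic_digits(alpha, p, count):
    """First `count` digits a_0, a_1, ... of alpha, each taken from Y = {0, ..., p-1}.

    alpha must be a rational number whose denominator is not divisible by p.
    """
    alpha = Fraction(alpha)
    if alpha.denominator % p == 0:
        raise ValueError("alpha is not in the valuation ring O")
    digits = []
    for _ in range(count):
        # a_i is the representative in Y of the residue class of alpha_i mod P.
        a = (alpha.numerator * pow(alpha.denominator, -1, p)) % p
        digits.append(a)
        # alpha_{i+1} := (1/p)(alpha_i - a_i), which lies in O again.
        alpha = (alpha - a) / p
    return digits


print(p_adic_digits(-1, 5, 6))              # every digit is p - 1 = 4
print(p_adic_digits(Fraction(1, 3), 5, 6))  # digits of 1/3 for p = 5
```

For instance, the first three digits of $1/3$ for $p = 5$ are 2, 3, 1, and indeed $3(2 + 3 \cdot 5 + 1 \cdot 25) = 126 \equiv 1 \bmod 125$.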
Thus we can write any non-zero $\bar\alpha \in \bar F$ in the "power series" form,

\[ \bar\alpha = p^k [a_0 + a_1 p + a_2 p^2 + \cdots], \]

where the $a_i$ are uniquely determined elements of $Y$, and $a_0 \ne 0$, so that $k$ is
determined by $e^{-k} = \bar\phi(\bar\alpha)$.

For example, if $p = x \in K[x] = D$, then, taking $Y = K$, we see that $\bar F$ is the
field of Laurent series over $K$, that is, the ring of power series of the form

\[ b_{-k} x^{-k} + \cdots + b_{-1} x^{-1} + b_0 + b_1 x + b_2 x^2 + \cdots, \quad k \in \mathbb{N}, \]

where the $b_i$ are in $K$ and one follows the usual rules of adding and multiplying
power series.

The ring of Laurent series is very important in the study of power series generating 
functions. Be aware that the field K here is still arbitrary. It can even be a finite field. 

Note that if $D = \mathbb{Z}$ and $p$ is a rational prime, the system of coset representatives
$Y$ is not closed under either addition or multiplication and so, in order to restore the
coefficients of the powers of $p$ to their proper location in $Y$, some local adjustments
are needed after addition or multiplication. For example if $p = 5$, then we must
rewrite

\[ (1 + 3(5))(1 + 4(5)) = 1 + 7(5) + 12(5^2) \]

as

\[ 1 + 2(5) + 3(5^2) + 2(5^3). \]
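
The "local adjustments" are nothing more than carrying, as in ordinary base-$p$ arithmetic. A minimal sketch, assuming finite expansions with non-negative integer coefficients (the names `normalize` and `multiply` are introduced here for illustration):

```python
def normalize(coeffs, p):
    """Rewrite sum(coeffs[i] * p**i) with every coefficient in Y = {0, ..., p-1}.

    coeffs is a finite list of non-negative integers; carries move to higher powers of p.
    """
    digits, carry = [], 0
    for c in coeffs:
        carry, digit = divmod(c + carry, p)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, p)
        digits.append(digit)
    return digits


def multiply(u, v):
    """Coefficient list of the product of two expansions in powers of p (no carrying yet)."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return out


raw = multiply([1, 3], [1, 4])   # (1 + 3p)(1 + 4p) = 1 + 7p + 12p^2
print(raw, normalize(raw, 5))    # the example from the text, with p = 5
```

With $p = 5$ this reproduces the rewriting above: the coefficient list `[1, 7, 12]` becomes `[1, 2, 3, 2]`.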


Appendix 2: Finite Division Rings 


In 1905 J.H.M. Wedderburn proved that every finite division ring is in fact a field. 

On pp. 104-105 of van der Waerden’s classic book (1930-31), (cited at the
end of this appendix) one finds a seemingly short proof that depends only on the
easy exercise which proves that no finite group can be the union of the conjugates of 
a proper subgroup. But the proof is short only in the sense that the number of words 
required to render it is not large. Measured from actual elementary first principles, 
the proof spans a considerable “logical distance”. It uses the notion of a splitting 
field for a division ring, and a certain structure theorem involving a tensor product 
of algebras. Both of these notions are beyond what has been developed in this and 
the preceding chapters. 

Several other more elementary proofs are available. The first proof given below 
is due to T.J. Kaczynski (cited at the end of this appendix) and has several points of 
interest. 

We begin with a small lemma. 
Lemma 11.A.8 Every element in a finite field GF($q$) is the sum of two squares. As
a consequence, when $p$ is a prime number, there exist numbers $t$ and $r$ such that

\[ t^2 + r^2 \equiv -1 \bmod p, \quad \text{with } t \not\equiv 0. \]


Proof If $q$ is even, all elements are squares and hence each is the sum of $0^2$ and a square.
So suppose $q$ is odd and let $S$ be the set of non-zero squares. Then, for any non-square
$g \in \mathrm{GF}(q)$, we have a partition $\mathrm{GF}(q)^* = S \cup Sg$. Now using 0 as one of the
summand squares, we see that every element of $\{0\} \cup S$ is the sum of two squares, and
if $(S + S) \cap gS \ne \emptyset$ (where now "$S + S$" indicates all possible sums of two squares),
then this is true of $gS$ as well and the result is proved. Otherwise $S + S \subseteq \{0\} \cup S$
and so $\{0\} \cup S$ is a subfield of GF($q$). That is impossible since $|\{0\} \cup S| = (q + 1)/2$
is not a prime power dividing $q$.

In particular, $-1$ can be represented as the sum of two squares in $\mathbb{Z}/(p)$, and one
of them at least (say $t^2$) is non-zero.
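
For the prime fields $\mathbb{Z}/(p)$ the lemma is easy to confirm by exhaustive search. The sketch below is only a check for small primes, not a substitute for the proof; the function names are illustrative.

```python
def minus_one_as_two_squares(p):
    """Return (t, r) with t*t + r*r == -1 (mod p) and t != 0 (mod p), for p prime."""
    for t in range(1, p):
        for r in range(p):
            if (t * t + r * r + 1) % p == 0:
                return t, r
    raise ValueError("no solution found; is p prime?")


def every_element_is_sum_of_two_squares(p):
    """Check the first assertion of the lemma in the prime field Z/(p)."""
    sums = {(x * x + y * y) % p for x in range(p) for y in range(p)}
    return sums == set(range(p))


for p in (2, 3, 5, 7, 11, 13):
    print(p, minus_one_as_two_squares(p), every_element_is_sum_of_two_squares(p))
```

The search is quadratic in $p$, which is adequate for experimenting with small primes.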


Theorem 11.A.3 (Wedderburn) Every finite division ring is a field.


Proof ([1]) Let $D$ be a finite division ring and let $P$ be the prime subfield in its
center. Also let $D^*$ be the multiplicative group of non-zero elements of $D$. For some
prime $r$ let $S$ be an $r$-Sylow subgroup of $D^*$, of order $r^a$, say. Now $S$ cannot contain
an abelian subgroup $A$ of type $\mathbb{Z}_r \times \mathbb{Z}_r$, otherwise the subfield generated by $P \cup A$
would have too many roots of $x^r - 1 = 0$. Thus $S$ is either cyclic or else $r = 2$
and $S$ is a generalized quaternion group. We shall show that the latter case is also
forbidden.

Suppose $S$ is generalized quaternion. Then the characteristic $p$ is not 2 and $S$
contains a subgroup $S_0$ which is quaternion of order 8, generated by two elements
$a$ and $b$ of order 4 for which $bab^{-1} = a^{-1}$ and $z = a^2 = b^2$ is its central involution.
Then the extension $P(z)$ is a subfield containing the two roots of $x^2 - 1 = 0$, namely
1 and $-1$. It follows that

\[ a^2 = -1 = z = b^2, \]

so $a^{-1} = -a$. Thus

\[ ba = a^{-1}b = -ab. \]

Now let $p = |P|$, the characteristic of $D$. Then by the lemma we can find elements
$t$ and $r$ in $P$ (and hence central in $D$) such that $t^2 + r^2 = z = -1$. Then

\[ \begin{aligned}
((ta + b) + r)((ta + b) - r) &= (ta + b)^2 - r^2 \\
&= t^2 a^2 + t(ab + ba) + b^2 - r^2 \\
&= -t^2 + b^2 - r^2 = 0.
\end{aligned} \]

So one of the factors on the left is zero. But then $b = -ta \pm r$ and so, as $t \ne 0$, $b$
commutes with $a$, a contradiction.
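
The vanishing product can be replayed concretely in the quaternion algebra over $\mathbb{Z}/(p)$ with basis $1, i, j, k$, taking $a = i$ and $b = j$. This is an illustration under that assumption only, with hypothetical function names. The algebra used here is not a division ring, and the computation exhibits exactly the pair of non-zero factors with zero product that a division ring cannot contain.

```python
def qmul(u, v, p):
    """Hamilton product of u = (w, x, y, z) ~ w + x*i + y*j + z*k, reduced mod p."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2) % p,
        (a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2) % p,
        (a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2) % p,
        (a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2) % p,
    )


def find_t_r(p):
    """Brute-force t != 0 and r with t^2 + r^2 = -1 in Z/(p)."""
    return next((t, r) for t in range(1, p) for r in range(p)
                if (t * t + r * r + 1) % p == 0)


def factors(p):
    """The two factors (t*a + b) + r and (t*a + b) - r, with a = i and b = j."""
    t, r = find_t_r(p)
    return (r % p, t, 1, 0), (-r % p, t, 1, 0)


for p in (3, 5, 7, 11, 13):
    u, v = factors(p)
    print(p, u, v, qmul(u, v, p))  # the product is the zero quaternion
```

Both factors have $j$-coordinate 1, so neither is zero, yet their product vanishes for every odd prime tried.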

Thus, regardless of the parity of $p$, every $r$-Sylow subgroup of $D^*$ must be cyclic.
It follows that $D^*$ must be solvable. Now let $Z^* = Z(D^*)$ be the center of the group
$D^*$ and suppose $Z^* \ne D^*$. Then we can find a normal subgroup $A$ of $D^*$ containing
$Z^*$ such that $A/Z^*$ is a minimal normal subgroup of $D^*/Z^*$. Since the latter is
solvable with all Sylow subgroups cyclic, $A/Z^*$ is cyclic. Now $Z^*$ is in the center
of $A$, and since no group has a non-trivial cyclic factor over its center, $A$ is abelian.

Now we shall show that in general, $D^*$ cannot contain a normal abelian subgroup
$A$ which is not in the center $Z^*$. Without loss of generality we can take (as above)
$Z^* < A \trianglelefteq D^*$. Choose $y \in A - Z^*$. Then there exists an element $x$ in $D^*$ which
fails to commute with $y$. Then also $x + 1$ is not in $A$ as it fails to commute with $y$ as
well. Then, as $A$ is normal,

\[ (1 + x)y = a(1 + x), \]

for some element $a \in A$. Then

\[ y - a = ax - xy = (a - xyx^{-1})x. \tag{11.42} \]

Now if $y - a = 0$, then $a = y$ and $a = xyx^{-1}$ by (11.42), so $y = xyx^{-1}$ against
the fact that $x$ and $y$ do not commute. Thus $(a - xyx^{-1})$ is invertible, and from
(11.42),

\[ x = (a - xyx^{-1})^{-1}(y - a). \tag{11.43} \]

But $a$, $y$ and $xyx^{-1}$ all belong to $A$ and so commute with $y$. Thus every factor
on the right side of (11.43) commutes with $y$, and so $x$ commutes with $y$, the same
contradiction as before.


Remark As noted, the last part of Kaczynski’s proof did not use finiteness at all. As 
a result, one can infer 


Corollary 11.A.5 (Kaczynski) Suppose D is any division ring with center Z. Then 
any abelian normal subgroup of D* lies in Z. 


One might prefer a proof that did not use the fact that a finite group with all Sylow 
subgroups cyclic is solvable. This is fairly elementary to a finite group theorist (one 
uses a transfer argument on the smallest prime and applies induction) but it would 
not be as transparent to the average man or woman on the street. 

Another line of argument due to I.N. Herstein (cited below) develops some truly 
beautiful arguments from first principles. Consider: 


Lemma 11.A.9 Suppose $D$ is a division ring of characteristic $p$, a prime, and let
$P$ be its prime subfield, which, of course, is contained in the center $Z$. Suppose $a$ is
any element of $D - Z$ which is algebraic over $P$. Then there exists an element $y \in D$
such that

\[ yay^{-1} = a^i \ne a \]

for some integer $i$.
Proof Fix element $a$ as described. The commutator mapping $\delta : D \to D$, which
takes an arbitrary element $x$ to $xa - ax$, is clearly an endomorphism of the additive
group $(D, +)$. Also, for any element $\lambda$ of the finite subfield $P(a)$ obtained by adjoining
$a$ to $P$, the fact that $\lambda$ commutes with $a$ yields

\[ \delta(\lambda x) = \lambda x a - a \lambda x = \lambda(xa - ax) = \lambda\delta(x). \]

Thus, regarding $D$ as a left vector space $V$ over the field $P(a)$, we see that $\delta$ is a linear
transformation of $V$.
Now, for any $x \in D$, and positive integer $k$, an easy computation reveals

\[ \delta^k(x) = \sum_{j=0}^{k} (-1)^j \binom{k}{j} a^j x a^{k-j}. \tag{11.44} \]


Now $P(a)$ is a finite field of order $p^m$, say. Then $a^{p^m} = a$. Since $D$ has characteristic
$p$, putting $k = p^m$ in (11.44) gives

\[ \delta^{p^m}(x) = x a^{p^m} - a^{p^m} x = xa - ax = \delta(x). \]

Thus the linear transformation $\delta$ of $V$ satisfies

\[ \delta^{p^m} - \delta = 0. \]

Thus, in the commutative subalgebra of $\mathrm{Hom}(V, V)$ generated by $\delta$, we have

\[ 0 = \delta^{p^m} - \delta = \prod_{\lambda \in P(a)} (\delta - \lambda I), \]

where the $\lambda I$ are the scalar transformations $V \to V$, $\lambda \in P(a)$.

Now $\ker\delta$ is the subalgebra $C_D(a)$ of all elements of $D$ which commute with
$a$. (This is a proper subspace of $V$ since $a$ is not in the center $Z$.) Thus $\delta$ acts as a
non-singular transformation of $W := V/C_D(a)$ while $\prod(\delta - \lambda I)$ (the product taken
over all non-zero $\lambda \in P(a)$) vanishes on it. It follows that for some $\lambda \in P(a) - \{0\}$,
$\delta - \lambda I$ is singular on $W$. This means there is a coset $b_0 + C_D(a) \ne C_D(a)$ such that

\[ \delta(b_0) = \lambda b_0 + c, \quad c \in C_D(a). \]

Setting $y = b_0 + \lambda^{-1}c$, we have $\delta(y) = \lambda y \ne 0$. This means

\[ yay^{-1} = a + \lambda \in P(a). \]

Now if $m$ is the order of $a$ in the multiplicative group $P(a)^*$, then $yay^{-1}$ is one of
the roots of $X^m - 1 = 0$ lying in $P(a)$. But since these roots are just the powers of
$a$, we have $yay^{-1} = a^i \ne a$, for some $i$, $1 < i \le m - 1$, as desired.
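
The step from (11.44) to $\delta^{p^m}(x) = xa^{p^m} - a^{p^m}x$ uses only associativity and characteristic $p$, so it can be tested in any ring of characteristic $p$. The sketch below checks the case $m = 1$ in the ring of $2 \times 2$ matrices over $\mathbb{Z}/(p)$. That ring is chosen purely for convenience (it is not a division ring), and all function names are illustrative.

```python
from itertools import product


def mat_mul(A, B, p):
    """Product of 2x2 matrices (tuples of tuples) with entries reduced mod p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )


def mat_sub(A, B, p):
    """Difference of 2x2 matrices with entries reduced mod p."""
    return tuple(tuple((A[i][j] - B[i][j]) % p for j in range(2)) for i in range(2))


def mat_pow(A, n, p):
    """n-th power of A by repeated multiplication."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = mat_mul(result, A, p)
    return result


def delta(x, a, p):
    """The commutator map x -> xa - ax."""
    return mat_sub(mat_mul(x, a, p), mat_mul(a, x, p), p)


def delta_power(x, a, p, k):
    """Apply delta k times to x."""
    for _ in range(k):
        x = delta(x, a, p)
    return x


def identity_holds(p):
    """Check delta^p(x) == x a^p - a^p x for all 2x2 matrices a, x over Z/(p)."""
    mats = [((w, x), (y, z)) for w, x, y, z in product(range(p), repeat=4)]
    for a in mats:
        ap = mat_pow(a, p, p)
        for x in mats:
            expected = mat_sub(mat_mul(x, ap, p), mat_mul(ap, x, p), p)
            if delta_power(x, a, p, p) != expected:
                return False
    return True
```

Calling `identity_holds(3)` runs through all $81^2$ pairs of matrices. Larger primes are possible but the exhaustive loop grows like $p^8$.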
Now we proceed with a proof of Wedderburn’s theorem. We may assume that $D$
is a non-commutative finite division ring in which all proper subalgebras are fields.
In particular, for any element $v \in D - Z$, the centralizing subalgebra $C_D(v)$ is a
field. It follows that the commuting relation on $D - Z$ is an equivalence relation. For
each commuting equivalence class $A_i$, and element $a_i \in A_i$, we have

\[ F_i := Z \cup A_i = C_D(a_i). \tag{11.45} \]


Let $D^*$ be the multiplicative group of non-zero elements of $D$, so $Z^* := Z - \{0\}$
is its center. Choose a coset $aZ^*$ so that its order in $D^*/Z^*$ is the smallest prime $s$
dividing the order of $D^*/Z^*$. The actual representative element $a$ can then be chosen
to have $s$-power order.

Now by Herstein’s lemma, there is an element $y \in D$ such that

\[ yay^{-1} = a^i \ne a. \]

From (11.45) it follows that conjugation by $y$ leaves the subfield $F := C_D(a)$
invariant, and induces on it an automorphism whose fixed subfield is

\[ C_D(y) \cap C_D(a) = Z. \]

But the same conclusion must hold when $y$ is replaced by a power $y^k$ which is not in
$Z$. It follows that $\mathrm{Gal}(F/Z)$ is cyclic of prime order $r$ and conjugation by $y$ induces
an automorphism of order $r$. Thus $y^r \in C_D(y) \cap C_D(a) = Z$, so $r$ is the order
of $yZ^*$ in the group $D^*/Z^*$. We can then choose the representative $y$ to be in the
$r$-Sylow subgroup of $Z(y)^*$, so that it has $r$-power order.

Now, by the “(very) little Fermat Theorem” $i^{s-1} \equiv 1 \bmod s$. Thus

\[ y^{s-1} a y^{-(s-1)} = a^{i^{s-1}} \equiv a \bmod Z^*. \]

So, setting $b := y^{s-1}$, we have

\[ bab^{-1} = az, \]

for some element $z$ in $Z$ which is also a power of $a$. Now from this equation, $z = 1$
if and only if $b \in C_D(y) \cap C_D(a) = Z$.

Case 1: $z = 1$ and $b$ is in the center. Then $y^{s-1} \in Z$, so $r$, the order of $yZ^*$ in
$D^*/Z^*$, divides $s - 1$. That is impossible since $s$ was chosen to be the smallest prime
divisor of $|D^*/Z^*|$.

Case 2: $z \ne 1$ and $b$ has order $r$ mod $Z^*$. Then $b^r a b^{-r} = a = az^r$. But since $z$
is an element of $s$-power order, we now have that $r = s$ and is the order of $z$. Now
at this stage, the multiplicative subgroup $H := \langle a, b \rangle$ of $D^*$ generated by $a$ and
$b$ is a nilpotent group generated by two elements of $r$-power order, and so is an
$r$-group. Since it is non-abelian without a subgroup of type $\mathbb{Z}_r \times \mathbb{Z}_r$, it is generalized
quaternion and so contains the quaternion subgroup of order 8. Herstein goes on to
eliminate the presence of $Q_8$ by an argument that essentially anticipates by three
years the argument given in the first part of Kaczynski’s proof above.¹⁷

There is another fairly standard proof which comes closer to being a proof from 
elementary principles. It uses an argument due to Witt (cited below) concerning 
cyclotomic polynomials and the partition of the group D* into conjugacy classes. 
One may consult Jacobson’s book (cited below), pp. 431-2, for a presentation of this
proof.


Reference 


1. Kaczynski TJ (1964) Another proof of Wedderburn’s theorem. Am Math Mon 71:652-653 


¹⁷ Actually Herstein inadvertently omitted Case 1 by taking s = r from a misreading of the definition
of s. As we see, this presents no real problem since his lemma applies when s is the smallest possible
prime.


Chapter 12 
Semiprime Rings 


Abstract As was the case with groups, a ring is said to be simple if it has no proper
homomorphic images (equivalently, it has no proper 2-sided ideals). On the other
hand, a right (or left) R-module without proper homomorphic images is said to be
irreducible. A right (or left) module is said to be completely reducible if it is a direct
sum of irreducible modules. Similarly, a ring R is said to be completely reducible if
and only if the right module $R_R$ is completely reducible. A ring is semiprimitive if and
only if the intersection of its maximal ideals is zero, and is semiprime if and only if
the intersection of all its prime ideals is the zero ideal. Written in the presented order,
each of these three properties of rings implies its successor; that is, the properties
become weaker. The goal here is to prove the Artin-Wedderburn theorem, basically
the following two statements: (1) A ring is completely reducible if and only if it is
a direct sum of finitely many full matrix algebras, each summand defined over its
own division ring. (2) If R is semiprimitive and Artinian (i.e. it has the descending
chain condition on right ideals) then the same conclusion holds. A corollary is that
any completely reducible simple ring is a full matrix algebra.


12.1 Introduction 


In this chapter we develop the material necessary to reach the famous Artin-Wedderburn
theorem which, in the version given here, classifies the Socle of a
prime ring. The reader should probably be reminded here that all rings in this chapter
have a multiplicative identity element.

We begin with the important concept of complete reducibility of modules. When 
transferred to rings, this notion and its equivalent incarnations give us quick access 
to the Socle of a ring and its homogeneous components. Most of the remaining 
development takes place in a general class of rings called the semiprime rings. It has 
several important subspecies, the prime, primitive, and completely reducible rings, 
where the important role of idempotents unfolds. 

We remind the reader that R-module homomorphisms are applied on the left.
Thus if $M$ and $M'$ are right R-modules and if $\phi : M \to M'$ is an R-module
homomorphism, then one has

\[ \phi(mr) = (\phi m)r, \quad \text{for all } m \in M \text{ and all } r \in R. \tag{12.1} \]

Next, if we set $E = \mathrm{End}_R(M) = \mathrm{Hom}_R(M, M)$, the endomorphism ring of $M$, then
(12.1) shows that $M$ has the structure of an $(E, R)$-bimodule, a formalism that will
be useful in the sequel.


12.2 Complete Reducibility 


Given a right R-module M, the concept of minimal or maximal submodule refers to minimal
or maximal elements in the poset of all proper submodules of M. Thus 0 is not a
minimal submodule and M itself is not a maximal submodule of M.

One defines the radical rad(M) of a right R-module M as the intersection of all
maximal submodules of M. If there are no maximal submodules, as in the $\mathbb{Z}$-module

\[ \mathbb{Z}(p^\infty) := \mathbb{Z}[1/p]/\mathbb{Z}, \]

then rad(M) = M, by definition.

There is a dual notion: One defines the socle Soc(M) as the submodule generated
by all minimal submodules (that is, all irreducible submodules) of M. Similarly, if
there are no irreducible submodules, as in the case of the free $\mathbb{Z}$-module $\mathbb{Z}_{\mathbb{Z}}$, then
by convention one sets Soc(M) = 0, the zero submodule.


Examples If D is any principal ideal domain (PID), and if p is a prime element in D, 
then the D-module M = D/p²D has radical rad(M) = Soc(M) = pD/p²D ≅ D/pD. 
(See Exercise (1) in Sect. 12.6.1.) In a similar vein, if F is a field, and if R is the 
matrix ring 

R = { [x 0; y z] | x, y, z ∈ F } (rows separated by the semicolon), 

consisting of lower-triangular matrices over F, and if V is the right R-module 
consisting of 1 × 2 row vectors with entries in F: 

V = {[x y] | x, y ∈ F}, 

it also follows that 

rad(V) = Soc(V) = {[x 0] | x ∈ F}. 


(See Exercise (4) in Sect. 12.6.1.) 
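
The first example can be checked by brute force in the smallest case D = Z, p = 3. 
The following is only a minimal sketch in Python, assuming the standard fact that the 
submodules of the Z-module Z/nZ are exactly the subgroups dZ/nZ for the divisors d 
of n. The helper names (submodules, radical_and_socle) are illustrative and are not 
taken from the text. 

```python
def submodules(n):
    """All Z-submodules of Z/nZ as frozensets (assumed: each is dZ/nZ with d | n)."""
    return [frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]


def radical_and_socle(n):
    """Return (rad, soc) for the Z-module Z/nZ, following the definitions above."""
    subs = submodules(n)
    whole, zero = frozenset(range(n)), frozenset({0})
    proper = [s for s in subs if s != whole]
    nonzero = [s for s in subs if s != zero]
    # Maximal submodules: maximal among the proper ones.
    maximal = [s for s in proper if not any(s < t for t in proper)]
    # Minimal submodules: minimal among the non-zero ones.
    minimal = [s for s in nonzero if not any(t < s for t in nonzero)]
    rad = whole  # convention: rad(M) = M if there are no maximal submodules
    for s in maximal:
        rad = rad & s
    soc = zero  # convention: Soc(M) = 0 if there are no minimal submodules
    for s in minimal:
        soc = frozenset((a + b) % n for a in soc for b in s)
    return rad, soc


rad9, soc9 = radical_and_socle(9)
rad12, soc12 = radical_and_socle(12)
```

For n = 9 both rad9 and soc9 are {0, 3, 6} = 3Z/9Z, in agreement with 
rad(M) = Soc(M) = pD/p²D. For n = 12 the two differ: the radical is {0, 6} while 
the socle is 2Z/12Z. 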


Theorem 12.2.1 Let M be a right R-module having at least one irreducible 
submodule, so that Soc(M) ≠ 0. Then 

(i) Soc(M) = ⊕_{i∈J} M_i, where {M_i}_{i∈J} is a family of irreducible submodules of M, 
and the sum is direct. 

(ii) In addition, Soc(M) is invariant under the endomorphism ring, E = Hom_R(M, M). 


Proof Note that the second statement (ii) is obvious, for if L ⊆ M is any irreducible 
R-module and φ is an R-module endomorphism, then the restriction of φ to L will 
have as kernel either 0 or all of L (as L is irreducible). Therefore, either φ(L) = 0 
or φ(L) ≅_R L. In either case, φ(L) ⊆ Soc(M). 

We now prove (i). Let the set I index the family of all irreducible submodules of M. 
Trivially, we may assume I is non-empty. We shall call a subset J of I a direct set 
if and only if Σ_{j∈J} M_j = ⊕_{j∈J} M_j is a direct sum. (Recall that this means that if a 
finite sum Σ_{k∈K} a_k = 0 where a_k ∈ M_k and K ⊆ J, then each a_k = 0. We note also 
that singleton subsets of I are direct, and so, as irreducible modules are assumed to 
exist, direct subsets of I really exist.) 

Suppose 

J_1 ⊂ J_2 ⊂ J_3 ⊂ ··· 

is a properly ascending tower of direct sets, and let J be the union of the sets in this 
tower. If J were not direct, there would be a finite subset K ⊆ J together with a sum 
Σ_{k∈K} a_k = 0 with not all a_k equal to zero. But since K is a finite subset, it clearly 
must be contained in J_n for some index n, which then says that J_n cannot have been 
direct, a contradiction. This says that in the poset of all direct subsets of I, partially 
ordered by inclusion, all simply ordered chains possess an upper bound. Thus by 
Zorn’s Lemma, this poset contains a maximal member, say H. 

We now claim that M_H := ⊕_{j∈H} M_j = Soc(M). Clearly, M_H ≤ Soc(M). Suppose 
M_H were properly contained in Soc(M). Then there would exist an irreducible 
submodule M_0 not in M_H. This forces M_H ∩ M_0 = 0 from which we conclude that 
M_H + M_0 = M_H ⊕ M_0, and so H ∪ {0} is a direct subset of I, properly containing 
H, contrary to the maximality of H. Thus Soc(M) = ⊕_{j∈H} M_j, and the proof is 
complete. 


Corollary 12.2.2 The following conditions on an R-module M are equivalent: 


(i) M = Soc(M). 
(ii) M is a sum of minimal submodules. 
(iii) M is isomorphic to a direct sum of irreducible submodules. 


An R-module M satisfying any of the above three conditions is said to be com- 
pletely reducible. 

Next, call a submodule L of a right R-module M large (some authors use the 
term essential) if it has a non-zero intersection with every non-zero submodule. 
(Note that if M ≠ 0, M is a large submodule of itself.) We can produce (but not 
actually construct) many large submodules using the following Lemma. 


Lemma 12.2.3 If B is a submodule of M and C is maximal among submodules of 
M meeting B at zero, then B + C is large. 

Proof Assume B and C chosen as above and let D ≠ 0 be any non-trivial submodule 
of M. Suppose (B + C) ∩ D = 0. If b, c, and d are elements of B, C and D, 
respectively, and b = c + d, then d = b − c and this is zero by our supposition. Thus 
b = c ∈ B ∩ C = 0, so b = c = 0. Thus B ∩ (C + D) = 0. By maximality of C, 
D ⊆ C. Then D = (B + C) ∩ D = 0, a contradiction. Thus always (B + C) ∩ D is 
non-zero, and so B + C is large. 


That such a submodule C exists which is maximal with respect to meeting B 
trivially, as in Lemma 12.2.3, is an easy application of Zorn’s Lemma. 

If M is a right R-module, let L(M) be the lattice of all submodules of M (sum is 
the lattice “join”, intersection is the lattice “meet”). We say L(M) is complemented 
(or that M is semisimple) if every submodule B of M has a complement in L(M), 
i.e., there exists a submodule B′ such that B ∩ B′ = 0 and B + B′ = M (this means 
M = B ⊕ B′). 


Lemma 12.2.4 If L(M) is complemented, then so is L(N) for any submodule 
N of M. 


Proof Let C be a submodule of N. By hypothesis there is a complement C′ to C in 
L(M). Then setting C″ := C′ ∩ N, we see 

C ∩ C″ ⊆ C ∩ C′ = 0, and 
C + C″ = C + (C′ ∩ N) = (C + C′) ∩ N (the modular law) 
= M ∩ N = N. 

Thus C″ is a complement of C in N. 


Remark Of course the above is actually a lemma about any modular lattice: if it is 
complemented, then so are any of its intervals above zero. 


Lemma 12.2.5 If M is a right R-module for which L(M) is complemented, then 
rad(M) = 0. 


Proof Let m be a non-zero element of M and let A be maximal among the submodules 
which do not contain m. (A exists by Zorn’s Lemma.) Suppose now A < B < M 
so A is not maximal in L(M). Let B′ be a complement to B in L(M). Then by the 
modular law (Dedekind’s Lemma) 

A = (B ∩ B′) + A = B ∩ (B′ + A). 

Now as B < M implies B′ ≠ 0 and A < B implies B′ ∩ A = 0, we see that A is 
properly contained in B′ + A. Then the maximality of A implies m ∈ B′ + A and 
also m ∈ B. Thus m lies in B ∩ (B′ + A) = A, contrary to the choice of A. 

Thus no such B exists and A is in fact a maximal submodule of M. Then we see 
that every non-zero module element is avoided by some maximal submodule of M, 
and hence is outside rad(M). It follows that rad(M) = 0. 


Theorem 12.2.6 The following conditions on an R-module M are equivalent: 


(i) M is completely reducible. 
(ii) M contains no proper large submodule. 
(iii) M is semisimple. 


Proof Assume condition (i), i.e., that M = Soc(M) and that A ⊂ M is a proper 
submodule of M. If A contained every irreducible submodule of M, then it would clearly 
contain Soc(M) = M. But then A couldn’t be proper, contrary to our assumption on 
A. Thus there is an irreducible submodule U not in A, and so A ∩ U = 0, since U 
is minimal. Thus A is not large, proving condition (ii). 

Next, assume that (ii) holds, i.e., M contains no proper large submodules, and let 
A ⊆ M be a submodule. Using Zorn’s lemma, there exists a submodule B ⊆ M 
which is maximal with respect to the condition A ∩ B = 0. But then Lemma 12.2.3 
asserts that A + B is large and so by (ii), A + B = M. This means that M = A ⊕ B, 
i.e., that B is a complement to A in M, proving that M is semisimple. 

Finally, assume condition (iii). Since M is semisimple, Soc(M) has a complement 
C in L(M). Suppose C ≠ 0. By Lemmas 12.2.4 and 12.2.5, we know that rad(C) = 0 
and so the set of maximal submodules of C must be nonempty (else C = rad(C) would 
be the intersection of the empty collection of maximal submodules). Letting L ⊆ C 
be a maximal submodule, we now apply Lemma 12.2.4 to infer the existence of a 
submodule A ⊆ C with C = A ⊕ L. But then A ≅ C/L, and so A is an irreducible 
R-module with A ∩ Soc(M) = 0. This is clearly impossible, so C = 0, in which 
case Soc(M) = M, proving (i). 
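
The three equivalent conditions can be tested directly on the Z-modules Z/nZ. The 
sketch below is a hedged illustration only: it assumes that the submodules of Z/nZ 
are the subgroups dZ/nZ with d dividing n, and the function names (submodules, add, 
classify, squarefree) are hypothetical helpers, not notation from the text. 

```python
def submodules(n):
    """Submodules of the Z-module Z/nZ (assumed: dZ/nZ for each divisor d of n)."""
    return [frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]


def add(n, a, b):
    """Sum A + B of two submodules of Z/nZ."""
    return frozenset((x + y) % n for x in a for y in b)


def classify(n):
    """Return the truth values of conditions (i), (ii), (iii) for M = Z/nZ."""
    subs = submodules(n)
    whole, zero = frozenset(range(n)), frozenset({0})
    nonzero = [s for s in subs if s != zero]
    minimal = [s for s in nonzero if not any(t < s for t in nonzero)]
    soc = zero
    for s in minimal:
        soc = add(n, soc, s)
    completely_reducible = soc == whole  # (i)  M = Soc(M)
    # A submodule is large if it meets every non-zero submodule non-trivially.
    large = [s for s in subs if all(s & t != zero for t in nonzero)]
    no_proper_large = all(s == whole for s in large)  # (ii)
    semisimple = all(  # (iii) every submodule has a complement
        any(a & b == zero and add(n, a, b) == whole for b in subs) for a in subs
    )
    return completely_reducible, no_proper_large, semisimple


def squarefree(n):
    return all(n % (p * p) for p in range(2, n + 1))


table = {n: classify(n) for n in range(1, 31)}
```

In this family the three flags always agree, and they are all true exactly when n is 
squarefree; for instance table[6] is (True, True, True) while table[4] is 
(False, False, False), the submodule {0, 2} of Z/4Z being both large and proper. 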


12.3 Homogeneous Components 


12.3.1 The Action of the R-Endomorphism Ring 


Let M be a right R-module and let E = Hom_R(M, M) be its R-endomorphism ring. 
As remarked earlier, we may regard M as a left E-module, giving M the structure 
of an (E, R)-bimodule. 

Let M be any right R-module, and let F be a family of irreducible R-modules. 
We set M[F] = Σ A, where the sum is taken over all irreducible submodules 
A ⊆ M such that A is R-isomorphic to at least one module in F. Note that if M 
contains no submodules isomorphic with any member of F, then M[F] = 0. If 
F = {N}, the set consisting of a single irreducible right R-module N, then we write 
M[N] := M[{N}], and call it the N-homogeneous component of M; again, it is 
entirely possible that M[N] = 0. Note that by Corollary 12.2.2, M[F] is completely 
reducible for any family F of irreducible submodules of M that we may choose. 
Next, suppose that F is a family of irreducible submodules of M, and that B ⊆ M 
is a submodule of M isomorphic with some irreducible submodule A ∈ F. Then for 
any e ∈ E = Hom_R(M, M) we have that eB = 0 or eB ≅_R A, proving that also 
eB ⊆ M[A] ⊆ M[F]. This proves already that M[A] (as well as M[F]) is invariant 
under the left action of every element of E. 

We can internalize this discussion to the ring R itself. In particular, if M = R_R, the 
regular right R-module R, and if A ⊆ R is a minimal right ideal, then R[A] = M[A] 
is obviously also a right ideal of R. But it’s also a left ideal since left multiplication 
by an element r ∈ R gives, by the associative multiplication in R, an R-module 
endomorphism R → R. By what was noted above, this proves that rR[A] ⊆ R[A], 
and so R[A] is a left ideal, as well. We shall capitalize on this simple observation 
shortly. 


Lemma 12.3.1 Assume that M is a completely reducible R-module and set E = 
Hom_R(M, M). If A ⊆ M is any irreducible submodule of M, then EA = M[A]. 


Proof We have already observed above that M[A] is invariant under every R-module 
endomorphism of M, so EA ⊆ E·M[A] ⊆ M[A]. Conversely, assume that A′ ⊆ M 
is a submodule with A ≅_R A′. Since rad(M) = 0 we may find a maximal 
R-submodule L ⊆ M with A ∩ L = 0. Therefore it follows that M = A ⊕ L. If 
π : M = A ⊕ L → A is the projection onto the first coordinate, and if φ : A → A′ 
is an R-module isomorphism, then the composite 

M --π--> A --φ--> A′ ↪ M 

defines an R-module endomorphism carrying A to A′. Hence A′ ⊆ EA, and so 
M[A] ⊆ EA. 


12.3.2 The Socle of a Module Is a Direct Sum 
of the Homogeneous Components 


Our first lemma below informs us that inside the modules M[F] we won’t encounter 
any unexpected submodules. 


Lemma 12.3.2 If M is an R-module and if F is a family of irreducible R-modules, 
then any irreducible submodule of M[F] is isomorphic with some member of F. 


Proof Set N = M[F] ⊆ M, and let A ⊆ N be an irreducible submodule. By 
Lemma 12.2.5 we have rad(N) = 0 from which we infer the existence of a maximal 
submodule L ⊆ N with A ∩ L = 0. Since N is generated by submodules each 
of which is isomorphic with a member of F, we may choose one, call it B, where 
B ∩ L = 0. Since L is maximal, we have N = A ⊕ L = B ⊕ L. Therefore, 

A ≅ (A ⊕ L)/L = N/L = (B ⊕ L)/L ≅ B, 

proving the result. 

As a consequence of the above, we may exhibit the socle of a module as the direct 
sum of its homogeneous components, that is, the direct sum of its A-homogeneous 
components M[A] as A ranges over a set of representatives of the R-isomorphism 
classes of irreducible submodules of M. 


Corollary 12.3.3 Soc(M) is the direct sum of the homogeneous components of M. 


Proof Let F be a family of pairwise non-isomorphic irreducible R-submodules 
of M such that each irreducible submodule of M is isomorphic with a member 
of F. Obviously Soc(M) = M[F] = Σ_{A∈F} M[A]. By Lemma 12.3.2 we see that 
for all A ∈ F, one has M[A] ∩ Σ_{B∈F_A} M[B] = 0, where F_A = F\{A}. Now by 
Theorem 8.1.6 on internal direct sums, it follows immediately that 

Soc(M) = Σ_{A∈F} M[A] = ⊕_{A∈F} M[A] 

as claimed. 


The next several sections concern rings rather than modules. But we can already 
apply a few of the results of this section to the module R_R. Of course, the first 
statement below has already been observed above. 


Corollary 12.3.4 Every homogeneous component of Soc(R_R) is a 2-sided ideal of 
R. Also Soc(R_R) is itself a 2-sided ideal of R (which we write as Soc(R)). 


12.4 Semiprime Rings 


12.4.1 Introduction 


We now pass to rings. The basic class of rings that we shall deal with are the semiprime 
rings and certain subspecies of these. In order to state the defining properties, a few 
reminders about 2-sided ideals are in order. 

If A and B are 2-sided ideals of the ring R, the product AB of ideals is the additive 
group generated by the set of all products {ab | (a, b) ∈ A × B}. This additive group 
is itself already a 2-sided ideal which the ring-theorists denote by the symbol AB. Of 
course it lies in the ideal A ∩ B. (From this definition, taking products among ideals 
is associative.) 

We say that a (2-sided) ideal A is nilpotent if and only if there exists some positive 
integer n such that 

A^n := A·A···A (n factors) = 0 := {0_R}, 

the zero ideal. 

An ideal P is said to be a prime ideal if and only if, whenever A and B are ideals 
such that AB ⊆ P, then either A ⊆ P or B ⊆ P. The intersection of all prime ideals 
of the ring R is called the prime radical of R and is denoted Prad(R). We say that the 
ring R is semiprime if the intersection of its prime ideals is 0, i.e., if Prad(R) = 0. 
Clearly Prad(R) is an ideal of R. A moment’s thought should be enough to reveal 
that R/Prad(R) is semiprime. If the 0-ideal of R is itself a prime ideal, the ring R is 
called a prime ring. 


12.4.2 The Semiprime Condition 


Theorem 12.4.1 The following conditions on a ring R are equivalent: 


(i) R is semiprime. 
(ii) For any ideals A and B of R, AB = 0 implies A ∩ B = 0. 
(iii) 0 is the only nilpotent ideal. 
(iv) For each non-zero x ∈ R, xRx ≠ 0. 


Proof (i)⇒(ii) Assume A and B are ideals for which AB = 0. Then AB ⊆ P for 
any prime ideal P so either A lies in P or B lies in P. In any event A ∩ B ⊆ P for 
each prime ideal P. Thus A ∩ B is contained in the intersection of all prime ideals, 
and so, since R is semiprime, is zero. 

(ii)⇒(iii) Suppose, by way of contradiction, that A is a non-zero nilpotent ideal 
of R. Let k be the least positive integer for which A^k = 0. Then A^{k−1}·A = 0, 
which implies by condition (ii) that A^{k−1} ⊆ A^{k−1} ∩ A = 0, contrary to the choice 
of exponent k. 

(iii)⇒(iv) If, for some 0 ≠ x ∈ R we had xRx = 0, then as x ∈ RxR, we 
conclude that RxR is a non-zero ideal of R. But then 

RxR·RxR = RxRxR = R·0·R = 0, 

forcing RxR to be a non-zero nilpotent ideal, against (iii). 

(iv)⇒(i) Let a ≠ 0; we shall prove that there exists a prime ideal P ⊆ R with 
a ∉ P, proving that R is semiprime. Set a_0 := a; by (iv), a_0Ra_0 ≠ 0. Therefore, there 
exists a non-zero element a_1 ∈ a_0Ra_0. We may continue this process to generate a 
sequence of non-zero elements a_0, a_1, ..., such that a_{i+1} ∈ a_iRa_i. Set T = {a_n | n ∈ N}. 
Before proceeding further, we note an important property of this sequence: 


(S) If the element a_i is in some ideal J ⊆ R, then so are all of its successors in T. 


Let P be an ideal maximal with respect to not containing any element of T. (It 
exists by Zorn’s Lemma.) Now suppose A and B are ideals of R neither of which is 
contained in P but with AB ⊆ P. By the maximality of P the ideals A + P and B + P 
both meet T non-trivially. So there are subscripts i and j such that a_i ∈ A + P, and 
a_j ∈ B + P. Set m = max(i, j). Then by the property (S), a_m lies in both P + A 
and P + B, so 

a_{m+1} ∈ a_mRa_m ⊆ (A + P)(B + P) ⊆ AB + P ⊆ P, 

against P ∩ T = ∅. Thus no such ideals A and B exist, so P is a prime ideal. Since 
P doesn’t contain a = a_0, the proof that R is semiprime is complete. 
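
Conditions (iii) and (iv) are easy to test by machine in the commutative rings Z/nZ. 
The following is a small sketch under stated assumptions: it uses the fact that every 
ideal of Z/nZ is generated by a divisor d of n, and that such an ideal is nilpotent 
exactly when some power of d vanishes modulo n. The function names are illustrative 
only. 

```python
def is_semiprime_by_iv(n):
    """Condition (iv): for every non-zero x in Z/nZ, the set xRx is non-zero."""
    ring = range(n)
    return all(any((x * r * x) % n for r in ring) for x in ring if x != 0)


def only_zero_nilpotent_ideal(n):
    """Condition (iii), assuming the non-zero ideals of Z/nZ are dZ/nZ, d | n, d < n.

    The ideal generated by d is nilpotent iff d**k is 0 mod n for some k; the
    exponent k = n is always large enough to detect this.
    """
    return all(pow(d, n, n) != 0 for d in range(1, n) if n % d == 0)


agreement = all(
    is_semiprime_by_iv(n) == only_zero_nilpotent_ideal(n) for n in range(2, 61)
)
```

For example Z/12Z fails both tests, since x = 6 satisfies 6·r·6 ≡ 0 (mod 12) for 
every r, while Z/30Z passes both. 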


12.4.3 Completely Reducible and Semiprimitive Rings 
Are Species of Semiprime Rings 


At the beginning of this chapter we defined the radical of a right R-module M to 
be the intersection of all maximal submodules of M, or, if there are no maximal 
submodules, to be M itself. Similarly, we define the radical, rad(R), of the ring 
R to be rad(R_R), the radical of the right module R_R. Accordingly, rad(R) is the 
intersection of all maximal right ideals of R and is called the Jacobson radical of R. 

While rad(R) is, a priori, a right ideal of R, the reader may be surprised to learn 
that it is also a left ideal. As this is relatively easy to show, we shall pause long enough 
to address this issue. Assume first that M is a non-zero irreducible right R-module. 
Then for each 0 ≠ m ∈ M, we have that (by irreducibility) M = mR. Furthermore, 
the mapping R → M, r ↦ mr is easily verified to be a surjective R-module 
homomorphism. This implies immediately that if J = Ann_R(m) := {r ∈ R | mr = 0}, 
then J is the kernel of the above R-module homomorphism: M ≅_R R/J. Clearly 
J is a maximal right ideal of R. Next, recall that any R-module M determines a ring 
homomorphism R → End_Z(M), the ring of abelian group homomorphisms of M. 
Furthermore, it is clear that Ann_R(M) := {r ∈ R | Mr = 0} is the kernel of the 
above ring homomorphism, and hence Ann_R(M) is a 2-sided ideal of R. Finally, 
note that 

Ann_R(M) = ∩_{0≠m∈M} Ann_R(m), (12.2) 

i.e., Ann_R(M) is the intersection of the maximal right ideals Ann_R(m), 0 ≠ m ∈ M. 
Our proof will be complete as soon as we can show that 

rad(R) = ∩_{J∈M(R)} Ann_R(R/J), 

where M(R) denotes the collection of all maximal right ideals of R. By Eq. (12.2) we 
see that each Ann_R(R/J) is an intersection of maximal right ideals of R; therefore if 
x ∈ rad(R) then x is contained in Ann_R(R/J) for every maximal right ideal J ⊆ R. 
This proves that 

rad(R) ⊆ ∩_{J∈M(R)} Ann_R(R/J). 

Conversely, if x ∈ Ann_R(R/J), then Eq. (12.2) implies that x ∈ J = Ann_R(1 + J), 
and so it is now clear that 

rad(R) ⊇ ∩_{J∈M(R)} Ann_R(R/J), 

proving that, as claimed, rad(R) is a 2-sided ideal of R. 

We note in passing that the Jacobson radical annihilates every irreducible right 
R-module. 

A ring is said to be semiprimitive if and only if its Jacobson radical is trivial. A 
ring is primitive if and only if there exists an irreducible right R-module M with 
Ann_R(M) = 0, i.e., if and only if R has a faithful irreducible module. It follows, of 
course, that every primitive ring is semiprimitive as is R/rad(R) for any ring R. 
Finally we say that a ring R is completely reducible if and only if the right module 
R_R is a completely reducible module. A hierarchy of these concepts is displayed in 
the following Theorem: 


Theorem 12.4.2 The following implications hold among properties of a ring: 
completely reducible ⇒ semiprimitive ⇒ semiprime 


Proof The first implication is just Lemma 12.2.5, together with Theorem 12.2.6 
applied to the module M = R_R. 

For the second implication, suppose R is a ring which is semiprimitive but not 
semiprime. Then rad(R) = 0 and there exists a non-zero nilpotent ideal N. By 
replacing N by an appropriate power if necessary, we may assume N² = 0 ≠ N. 
Since rad(R) = 0, there exists a maximal right ideal M not containing N. Then 
R = N + M by maximality of M. Thus we may write 1 = n + m, where n ∈ N and 
m ∈ M. Then 

n = 1·n = n² + mn = 0 + mn ∈ M. 

Thus 1 = n + m ∈ M, which is impossible. Thus we see that N must lie in every 
maximal right ideal, so N ⊆ rad(R) = 0. This contradicts N ≠ 0. Thus no such 
ring R exists, and so the implication holds. 
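
The second implication can be observed concretely in the rings Z/nZ. This is a hedged 
sketch, assuming the standard description of the maximal ideals of Z/nZ as the ideals 
pZ/nZ for the primes p dividing n; the function names are hypothetical. 

```python
def jacobson_radical(n):
    """rad(Z/nZ), computed as the intersection of the ideals pZ/nZ, p prime, p | n."""
    primes = [
        p for p in range(2, n + 1)
        if n % p == 0 and all(p % q for q in range(2, p))
    ]
    rad = set(range(n))
    for p in primes:
        rad &= set(range(0, n, p))
    return rad


def is_semiprime(n):
    """Condition (iv) of Theorem 12.4.1: xRx is non-zero for every non-zero x."""
    return all(
        any((x * r * x) % n for r in range(n)) for x in range(1, n)
    )


# Semiprimitive (radical zero) should imply semiprime for every n tested.
implication_holds = all(
    jacobson_radical(n) != {0} or is_semiprime(n) for n in range(2, 61)
)
```

For n = 12 the radical is {0, 6}, a non-zero ideal with square zero, so Z/12Z is 
neither semiprimitive nor semiprime. 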


12.4.4 Idempotents and Minimal Right Ideals 
of Semiprime Rings 


The importance of semiprimeness for us is that it affects the structure of the minimal 
right ideals of R. 

We pause, however, to define an idempotent in a ring R to be a non-zero element e 
satisfying e² = e. Such elements play a central role in this and the following 
subsection. Furthermore, the role that they play is analogous to that played by 
projections in linear algebra. 


Lemma 12.4.3 (Brauer’s lemma) Let K be a minimal right ideal of a ring R. Then 
either 

(i) K² = 0, or 

(ii) K = eR, for some idempotent e ∈ R. 


Proof If K² ≠ 0, then by the minimality of K, K = K² = kK for some element 
k ∈ K. From this we infer the existence of an element e ∈ K such that ke = k. Next, 
since the right annihilator, Ann_K(k) = {x ∈ K | kx = 0}, is a right ideal of R lying 
properly in K, we have Ann_K(k) = 0. But as ke = k, 

k(e² − e) = (ke)e − ke = ke − ke = 0, 

so e² − e ∈ Ann_K(k) = 0, forcing e² = e. Finally, 0 ≠ eR ⊆ K, so eR = K as K 
was minimal. 
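
Both alternatives of Brauer’s lemma occur in a small non-semiprime ring. The sketch 
below is an illustration chosen here, not an example from the text: it takes R to be 
the ring of upper-triangular 2 × 2 matrices over the field with two elements, encodes 
the matrix [a b; 0 c] as the triple (a, b, c), and finds the minimal right ideals as the 
minimal non-zero principal right ideals xR. 

```python
from itertools import product

# (a, b, c) stands for the upper-triangular matrix [[a, b], [0, c]] over F_2.
R = list(product((0, 1), repeat=3))
ZERO = (0, 0, 0)


def mul(x, y):
    """Matrix product of two upper-triangular matrices, entries reduced mod 2."""
    a, b, c = x
    p, q, r = y
    return (a * p % 2, (a * q + b * r) % 2, c * r % 2)


def principal_right_ideal(x):
    return frozenset(mul(x, r) for r in R)


def minimal_right_ideals():
    """Minimal right ideals: minimal members among the non-zero ideals xR."""
    ideals = {principal_right_ideal(x) for x in R if x != ZERO}
    return [K for K in ideals if not any(J < K for J in ideals)]


def brauer_type(K):
    """Return 'nilpotent' if K^2 = 0, otherwise an idempotent e with eR = K."""
    if all(mul(x, y) == ZERO for x in K for y in K):
        return "nilpotent"
    for e in K:
        if e != ZERO and mul(e, e) == e and principal_right_ideal(e) == K:
            return e
    return None


types = {brauer_type(K) for K in minimal_right_ideals()}
```

In this ring there are three minimal right ideals. The one spanned by (0, 1, 0) has 
square zero, while the other two are generated by the idempotents (0, 0, 1) and 
(0, 1, 1). 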


Corollary 12.4.4 If R is semiprime, then every minimal right ideal of R has the 
form eR for some idempotent e € R. 


Proof Let K be a minimal right ideal of R. Then by Brauer’s Lemma, the conclusion 
holds or else K² = 0. But in the latter case, RK is a nilpotent 2-sided ideal containing 
K, as 

(RK)² = R(KR)K ⊆ RK² = 0. 

This contradicts the fact that R is semiprime. 


Lemma 12.4.5 Suppose e and f are non-zero idempotent elements of a ring R. 
Then there is an additive group isomorphism Hom_R(eR, fR) ≅ fRe. Moreover, if 
f = e, then this isomorphism can be chosen to be a ring isomorphism. 


Proof Consider the mapping ψ : fRe → Hom_R(eR, fR) defined by 

ψ(fre) : es ↦ (fre)es = fres, where r, s ∈ R. 

In other words, for each r ∈ R, ψ(fre) is left multiplication of eR by fre. This is 
clearly a homomorphism of right R-modules eR → fR. Note that ψ(fre)(e) = fre, 
so ψ(fre) is the zero homomorphism if and only if fre = 0. So the 
mapping ψ is injective. 

The mapping is also surjective, for if λ is a right module homomorphism eR → 
fR, then λ(e) = fs, for some s ∈ R. Then, for any r ∈ R, 

λ(er) = λ(e²r) = λ(e)·er = (fse)er = (fse)(er), 

so the R-homomorphism λ is an image of the mapping ψ. 

Of course for elements r_1, r_2 in R, f(r_1 + r_2)e = fr_1e + fr_2e has a ψ-image that 
is the sum of the R-morphisms ψ(fr_1e) + ψ(fr_2e), so ψ preserves addition. Thus 
ψ is an isomorphism fRe → Hom_R(eR, fR) of additive groups. 

Now suppose e = f. Then eRe inherits a ring multiplication from R that is 
distributive with respect to addition. We must show that ψ preserves multiplication 
in this case. For two arbitrary elements er_1e, er_2e of eRe, the ψ-image of their 
product is left multiplication of eR by er_1e·er_2e, which by the associative law, is the 
composition ψ(er_1e) ∘ ψ(er_2e) in Hom_R(eR, eR) = End_R(eR). 


Lemma 12.4.6 If R is a semiprime ring, and e² = e ∈ R, then eR is a minimal 
right ideal if and only if eRe is a division ring. 


Proof (⇒) If eR is a minimal right ideal, then End_R(eR) is a division ring by Schur’s 
Lemma (Lemma 8.2.6). By the preceding Lemma 12.4.5, eRe is isomorphic to this 
division ring. 

(⇐) Assume eRe is a division ring. Let us assume, by way of contradiction, that 
eR is not a minimal right ideal. Then there is a right ideal N, such that 0 < N < eR. 
Then Ne ⊆ eRe ∩ N. Since a division ring has no proper right ideals, eRe ∩ N is 
either 0 or eRe. 

Suppose Ne ≠ 0. Then there exists an element n ∈ N such that ne ≠ 0. Then 
ne, being a non-zero element of a division ring, possesses a multiplicative inverse in 
eRe, say ese. Then 

ne(ese) = nese = e. 

Since N is a right ideal in R, it follows that e lies in N. Thus eR ⊆ N, contrary to 
our choice of N. Thus we must have Ne = 0. 

Now eN is a right ideal of R such that (eN)² = e(Ne)N = 0. Since R is semiprime, 
it possesses no non-trivial nilpotent right ideals (Theorem 12.4.1). Thus eN = 0. But 
left multiplication by e is the identity endomorphism eR → eR, yet it annihilates the 
submodule N of eR. It follows that N = 0, also against our choice of N. 

Thus no such N as hypothesized exists, and so eR is an irreducible R-module, 
i.e., a minimal right ideal. 


Corollary 12.4.7 Let e be an idempotent element of the semiprime ring R. Then eR 
is a minimal right ideal if and only if Re is a minimal left ideal. 


Proof In Lemma 12.4.6 a fact about a right ideal is equivalent with a fact about a ring 
eRe which possesses a complete left-right symmetry. Thus, applying this symmetry, 
eRe is a division ring if and only if the principal left ideal Re is minimal—that is, an 
irreducible R-module. 


Lemma 12.4.8 Let e and f be idempotent elements of the ring R. Then the right 
ideals eR and f R are isomorphic as R-modules if and only if there exist elements u 
and v in R such that vu = e and uv = f. 


Proof (⇒) Assume there exists an isomorphism α : eR → fR. Here we exploit 
the isomorphism ψ⁻¹ : Hom_R(eR, fR) → fRe of Lemma 12.4.5, to see that α is 
achieved by left multiplication by an element u = fae ∈ fRe. Note that u = fue, 
since e and f are idempotent elements. Similarly, α⁻¹ is achieved by left 
multiplication by v = ebf ∈ eRf. Clearly v = evf, since e and f are idempotents. The left 
multiplications compose as α⁻¹ ∘ α = 1_{eR}, which is represented as left multiplication 
by vu and by e. Thus vu = e. Similarly, α ∘ α⁻¹ = 1_{fR} is left multiplication of fR 
by uv and by f. Thus uv = f. 

(⇐) Now assume that there exist elements u and v in R such that vu = e 
and uv = f. Then ue = uvu = fu, so left multiplication by u effects an 
R-homomorphism eR → fR as right R-modules. Similarly, left multiplication by v 
effects another R-homomorphism of right R-modules fR → eR. Now since vu = e 
and uv = f, these morphisms are inverses of each other, and so are R-isomorphisms 
of R-modules. Hence eR ≅ fR. 
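
A concrete instance of the lemma can be checked by enumeration. The sketch below 
is an illustration chosen here rather than an example from the text: R is the ring of 
all 2 × 2 matrices over the field with two elements, e and f are the diagonal matrix 
units E11 and E22, and the candidates u = E21 and v = E12 are assumptions that the 
code verifies. 

```python
from itertools import product


def mul(x, y):
    """Product of 2x2 matrices over F_2, each stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % 2 for j in range(2))
        for i in range(2)
    )


R = [((a, b), (c, d)) for a, b, c, d in product((0, 1), repeat=4)]

e = ((1, 0), (0, 0))  # E11
f = ((0, 0), (0, 1))  # E22
u = ((0, 0), (1, 0))  # E21, an element of fRe
v = ((0, 1), (0, 0))  # E12, an element of eRf

eR = {mul(e, r) for r in R}
fR = {mul(f, r) for r in R}

# Left multiplication by u on eR and by v on fR, tabulated as dictionaries.
left_u = {x: mul(u, x) for x in eR}
left_v = {y: mul(v, y) for y in fR}
```

Here vu = e and uv = f, left multiplication by u carries eR onto fR, and the two 
tabulated maps are mutually inverse, so eR ≅ fR as right R-modules. 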


Now consider a homogeneous component A of the semiprime ring R. We have 
already observed that left multiplication of a minimal right ideal eR by an element 
r € R produces a module reR which, being a homomorphic image of an irreducible 
module, is either 0 or is an irreducible right module isomorphic to eR. It follows that 
RA ⊆ A, so A is a 2-sided ideal. But there is more. 

First, A, being the sum of irreducible right R-modules, is a completely reducible 
module. By Lemma 12.3.2 every irreducible right submodule in A is of the same 
isomorphism type. Because of this, Lemma 12.4.8 tells us that we can pass from 
one irreducible right submodule of A to any other by multiplying on the left by an 
appropriate element of R. Thus A is a completely reducible left R-module spanned 
by irreducible left R-modules of the same isomorphism type. These observations 
imply the following: 


Corollary 12.4.9 Let R be a semiprime ring, and let A be any homogeneous 
component of R_R. Then A is an irreducible left R-module, and so is a minimal 2-sided 
ideal of R. We also see from Lemma 12.4.6, that A is also a homogeneous component 
of _RR. 


12.5 Completely Reducible Rings 


The left-right symmetry that has emerged in the previous section can be applied to 
the characterization of completely reducible rings. 


Theorem 12.5.1 The following four statements are equivalent for any ring R. 


(i) Every right R-module is completely reducible. 
(ii) R_R is completely reducible. 
(iii) Every left R-module is completely reducible. 
(iv) _RR is completely reducible. 


Proof That (i) implies (ii) is true a fortiori. Now assume (ii), and let M be a right 
R-module. We may write R = Σ_{i∈I} A_i where the A_i are minimal right ideals in R. 
We clearly have 

M = Σ_{m∈M} mR = Σ_{m∈M} Σ_{i∈I} mA_i. 

Note that the mapping A_i → mA_i, a ↦ ma is a surjective R-module homomorphism; 
therefore, it follows that either mA_i = 0 or mA_i is an irreducible submodule 
of M. By Corollary 12.2.2, we see that M is completely reducible. This proves that 
(i) and (ii) are equivalent. In an entirely similar fashion (iii) and (iv) are equivalent. 
Finally, by Corollary 12.4.7, (ii)⇔(iv), which clearly finishes the proof. 


There is an immediate application: 


Corollary 12.5.2 Let D be a division ring. Then any right vector space V over D 
is completely reducible. In particular V is a direct sum of copies of D_D. 


Proof Since D is a division ring, D_D is an irreducible module, so any right D-module 
V (for that is what a right vector space is!) is completely reducible. 


Before stating the following, recall that R is a prime ring if and only if 0 is a 
prime ideal. 


Lemma 12.5.3 Let R be a prime ring, and assume that S = Soc(R_R) ≠ 0. If eR is 
a minimal right ideal of R (where e is an idempotent element), then End_R(S) := 
Hom_R(S, S) is isomorphic to the ring of all linear transformations End_{eRe}(Re) := 
Hom_{eRe}(Re, Re). (Note that Re is a right eRe-vector space.) 


Proof Since S = Soc(R_R) is a 2-sided ideal and is the direct sum of its homogeneous 
components, each of which is a 2-sided ideal, the prime hypothesis implies that S is 
itself a homogeneous component. Let S be written as a direct sum ⊕ e_iR of minimal 
right ideals, one of which is eR, say e = e_1. Then each e_iR ≅ eR so, 
by Lemma 12.4.8, there exist elements u_i and v_i such that v_iu_i = e and u_iv_i = e_i. 
Then 

u_iev_i = u_iv_iu_iv_i = e_i² = e_i. 


Now let φ ∈ Hom_R(S, S). We define 

τ(φ) : Re → Re 

by the rule 

τ(φ)(re) = φ(re) = φ(re)e, for all r ∈ R. 

Note that τ(φ) is right eRe-linear, for if r_1e, r_2e ∈ Re, and if es_1e, es_2e ∈ eRe, then, 

τ(φ)(r_1e·es_1e + r_2e·es_2e) = φ(r_1e·es_1e + r_2e·es_2e) 
= φ(r_1e·es_1e) + φ(r_2e·es_2e) 
= φ(r_1e)es_1e + φ(r_2e)es_2e 
= τ(φ)(r_1e)es_1e + τ(φ)(r_2e)es_2e. 

Also, if φ_1, φ_2 ∈ Hom_R(S, S), it is clear that 

τ(φ_1 + φ_2) = τ(φ_1) + τ(φ_2) 

and that τ(φ_1 ∘ φ_2) = (τ(φ_1)) ∘ (τ(φ_2)), proving that τ is a ring homomorphism. 
Clearly, τ(φ) = 0 if and only if φ(re) = 0 for all r ∈ R. So, if τ(φ) = 0, then 

φ(e_i) = φ(u_iev_i) = φ(u_ie)v_i = 0·v_i = 0 

for all indices i. Thus φ(S) = 0, and hence φ is the zero mapping. It follows that τ is 
injective. 
We shall now show that τ : Hom_R(S, S) → Hom_{eRe}(Re, Re) is surjective. Let 
ψ ∈ Hom_{eRe}(Re, Re), and define φ ∈ Hom_R(S, S) as follows. If s ∈ S, then since 
S = ⊕ e_iR is a direct sum, we may write s as s = Σ e_ir_i, with unique summands 
e_ir_i, for suitable elements r_i ∈ R. Here we warn the reader that the elements r_i are 
not necessarily unique, as the idempotents e_i do not necessarily comprise a basis of 
S (that is, S need not be a free R-module). We set 

φ(s) = Σ ψ(u_ie)v_ir_i. 

To show that φ is well defined, we need only show that if e_ir_i = 0, then ψ(u_ie)v_ir_i = 0. 
However, as ψ ∈ Hom_{eRe}(Re, Re), and since e ∈ eRe, we have that 

ψ(u_ie)v_ir_i = ψ(u_ie·e)v_ir_i = ψ(u_ie)ev_ir_i 
= ψ(u_ie)v_iu_iv_ir_i 
= ψ(u_ie)v_ie_ir_i 
= ψ(u_ie)v_i·0 = 0, 

and so φ is well defined. Checking that φ is R-linear is entirely routine. 
The proof will be complete as soon as we show that τ(φ) = ψ. Thus, let re ∈ Re 
and write re = Σ e_ir_i ∈ S. Then 

τ(φ)(re) = φ(re) = φ(re)e 
= Σ φ(e_ir_i)e 
= Σ ψ(u_ie)v_ir_ie 
= Σ ψ(u_ie)ev_ir_ie 
= Σ ψ(u_ie·ev_ir_ie) 
= Σ ψ(u_iev_ir_ie) 
= Σ ψ(u_iv_iu_iv_ir_ie) 
= Σ ψ(e_i²r_ie) 
= Σ ψ(e_ir_ie) 
= ψ(Σ e_ir_ie) 
= ψ(re·e) = ψ(re), 

where all sums on the right are over the parameter i. Comparing first and last terms, 
we see that τ(φ) = ψ, proving that τ is a surjective mapping. 


Before passing to our main result, we pause long enough to observe that for any 
ring R, we have R ≅ End_R(R_R). This isomorphism is given via r ↦ λ_r, where, as 
usual, λ_r(s) = rs, for all r, s ∈ R. Furthermore, the associative and distributive laws 
in R guarantee not only that each λ_r is a homomorphism of the right R-module R but 
also that the mapping r ↦ λ_r defines a ring homomorphism R → End_R(R_R). This 
homomorphism is injective since λ_r = 0 implies 0 = λ_r(1) = r·1 = r. Finally, if 
φ ∈ End_R(R), set r = φ(1); then for all s ∈ R, 

λ_r(s) = rs = φ(1)s = φ(1·s) = φ(s), 

so φ = λ_r and the homomorphism is surjective. 


Theorem 12.5.4 (Wedderburn-Artin) 


(i) A ring R is completely reducible if and only if it is isomorphic to a finite direct 
product of completely reducible simple rings. 

(ii) A ring R is completely reducible and simple if and only if it is the ring of all 
linear transformations of a finite dimensional vector space. 


Proof First we prove part (i). Let R be completely reducible, so R is a direct sum 
R = ⊕_{i∈I} A_i, where the A_i are minimal right ideals. Now write 1 = Σ_{i∈I} e_i, where 
each e_i ∈ A_i. Note that for each j ∈ I, we have 

e_j = 1·e_j = (Σ_{i∈I} e_i) e_j = Σ_{i∈I} e_ie_j. 

Since e_ie_j ∈ A_ie_j ⊆ A_i, we conclude that e_j² = e_j and that when i ≠ j, e_ie_j = 0. 
Next, since 1 = Σ e_i is necessarily a finite sum, we may as well write this as 

1 = Σ_{i=1}^{n} e_i 

for some n. Therefore, it follows that 

R = Σ_{i=1}^{n} e_iR ⊆ Σ_{i=1}^{n} A_i. (12.3) 

Since e_iR ⊆ A_i, we infer from Eq. (12.3) that the direct sum R = ⊕ A_i is necessarily 
finite. Furthermore, since each A_i is a minimal right ideal, we must have e_iR = A_i, 
and so conclude that R is a finite direct sum of minimal right ideals: 

R = e_1R ⊕ e_2R ⊕ ··· ⊕ e_nR. (12.4) 


Next, we gather the above minimal right ideals into their constituent homogeneous 
components, giving rise to the direct sum 

R = C_1 ⊕ C_2 ⊕ ··· ⊕ C_m, 

where the C_1, ..., C_m are the distinct homogeneous components of R. If we write 
the identity element of R according to the above sum: 

1 = Σ_{j=1}^{m} c_j, 

we easily infer that c_j² = c_j, and that c_jc_k = 0 whenever j ≠ k. Furthermore, if 
x_j ∈ C_j, and if k ∈ {1, 2, ..., m}, we see that x_jc_k and c_kx_j are both elements of 
C_j ∩ C_k. If j ≠ k, then C_j ∩ C_k = 0 so x_jc_k = c_kx_j = 0. From this it follows that 

x_j = x_j·1 = x_jc_j = 1·x_j = c_jx_j. 

This proves that each of the 2-sided ideals C_j has a multiplicative identity, viz., 
c_j. That C_j is a simple ring is already contained in Corollary 12.4.9. That C_j is 
completely reducible follows from the definition of homogeneous component. 

Conversely, if R is a finite direct product of completely reducible rings, this finite 
direct product can be regarded as a finite direct sum, which already implies that R_R 
is completely reducible. 

We turn now to part (ii). Let $R$ be a completely reducible and simple ring. Let $eR$
be a minimal right ideal of $R$. By Lemma 12.5.3 together with the discussion above,
we have


$$R \cong \mathrm{End}_R(R_R) \cong \mathrm{End}_{eRe}(Re).$$


Our proof will be complete as soon as we show that $Re$ is a finite-dimensional
vector space over the division ring $eRe$. Note first of all that the right $R$-module $R_R$,
being a direct sum of finitely many irreducible modules as in Eq. (12.4), is certainly a
Noetherian module. Therefore, any collection of right ideals has a maximal member.
To show that $Re$ is finite dimensional over $eRe$, it suffices to show that any collection
of subspaces of $Re$ has a maximal member (i.e., that $Re$ is a Noetherian $eRe$-module).
Thus, let $\{L_\alpha\}$ be a collection of $eRe$-subspaces of $Re$, and note that $\{L_\alpha R\}$ is a family
of right ideals of $R$. Since $R_R$ is Noetherian, it has a maximal member, say $L_\beta R$.
Suppose $L_\gamma$ is a subspace in the collection with $L_\gamma \supseteq L_\beta$. Then we have $L_\gamma R = L_\beta R$. Furthermore,
as $L\,eRe = L$ for any subspace $L \subseteq Re$, we have


$$L_\gamma = L_\gamma\, eRe = L_\gamma Re = L_\beta Re = L_\beta\, eRe = L_\beta,$$


and so $L_\beta$ is a maximal member of the collection.

Finally, we must prove that the ring of endomorphisms of a finite-dimensional
vector space forms a simple ring. Assume that $R = \mathrm{End}_D(V)$, the ring of linear
transformations of a finite-dimensional vector space $V$ over a division ring $D$. In order
to make the arguments more transparent, we shall show first that $\mathrm{End}_D(V) \cong M_n(D)$,
the ring of $n \times n$ matrices over $D$, where $n$ is the dimension of $V$ over $D$. Fix an
ordered basis $(v_1, v_2, \ldots, v_n)$ and let $\phi \in \mathrm{End}_D(V)$. Write


$$\phi(v_j) = \sum_{i=1}^{n} v_i a_{ij}, \qquad j = 1, 2, \ldots, n,$$


where $a_{ij} \in D$. It is entirely routine to check that the mapping $\phi \mapsto A = [a_{ij}] \in M_n(D)$
is a ring isomorphism. Next, one argues that for $i = 1, 2, \ldots, n$, the sets


$$L_i = \left\{ \begin{bmatrix} 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{in} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \;\middle|\; a_{i1}, a_{i2}, \ldots, a_{in} \in D \right\} \subseteq M_n(D)$$


(only row $i$ may be nonzero) are isomorphic minimal right ideals in $M_n(D)$. This makes
all of $M_n(D)$ a single homogeneous component, and so $M_n(D)$ is a simple ring.
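
The last step of the proof can be checked by brute force in a very small case. The following is a minimal sketch, assuming the division ring is the field Z/3Z and n = 2 (both choices are made only for illustration, and all function names are hypothetical). It verifies that the diagonal matrix units are orthogonal idempotents summing to 1, and that each row set L_i is a right ideal generated by any one of its nonzero elements, which is what minimality means.

```python
from itertools import product

P = 3  # entries live in the field Z/3Z (an illustrative assumption)
N = 2  # matrix size, so R = M_2(Z/3Z)

def matmul(a, b):
    """Product of two N x N matrices with entries reduced mod P."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(N)) % P for j in range(N))
        for i in range(N)
    )

def matadd(a, b):
    """Sum of two N x N matrices with entries reduced mod P."""
    return tuple(tuple((a[i][j] + b[i][j]) % P for j in range(N)) for i in range(N))

def unit(i, j):
    """Matrix unit E_ij: 1 in position (i, j) and 0 elsewhere."""
    return tuple(tuple(1 if (r, c) == (i, j) else 0 for c in range(N)) for r in range(N))

ZERO = tuple(tuple(0 for _ in range(N)) for _ in range(N))
IDENTITY = tuple(tuple(1 if r == c else 0 for c in range(N)) for r in range(N))

def all_matrices():
    """List every element of M_N(Z/P)."""
    return [
        tuple(tuple(entries[r * N + c] for c in range(N)) for r in range(N))
        for entries in product(range(P), repeat=N * N)
    ]

def row_ideal(i):
    """The set L_i of matrices whose rows other than row i are zero."""
    return {
        m for m in all_matrices()
        if all(m[r][c] == 0 for r in range(N) if r != i for c in range(N))
    }

def idempotents_ok():
    """E_11, ..., E_NN are orthogonal idempotents whose sum is the identity."""
    e = [unit(i, i) for i in range(N)]
    total = ZERO
    for x in e:
        total = matadd(total, x)
    orthogonal = all(
        matmul(e[i], e[j]) == (e[i] if i == j else ZERO)
        for i in range(N) for j in range(N)
    )
    return orthogonal and total == IDENTITY

def row_ideal_is_minimal(i):
    """Every nonzero m in L_i satisfies mR = L_i, so L_i is a minimal right ideal."""
    ideal = row_ideal(i)
    ring = all_matrices()
    return all(
        {matmul(m, x) for x in ring} == ideal
        for m in ideal if m != ZERO
    )
```

Calling `idempotents_ok()` and `row_ideal_is_minimal(i)` for each row index i exercises the decomposition R = e_1 R ⊕ ... ⊕ e_n R of Eq. (12.4) in this one small ring; it is a sanity check, not a substitute for the general argument.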


We turn, finally, to an alternative formulation of the Wedderburn-Artin theorem.
In this version, we assume the ring $R$ to be semiprimitive (so rad(R) = 0), as well as
being (right) Artinian. The object shall be to prove that $R_R$ is completely reducible.


Theorem 12.5.5 Let R be a semiprimitive Artinian ring. Then R can be represented 
as a finite direct product of matrix rings over division rings. 


Proof It clearly suffices to prove that $R_R$ is completely reducible. Since $R$ is right
Artinian, we may find a minimal right ideal $I_1 \subseteq R$. Also, since rad(R) = 0, we may
find a maximal right ideal, say $M_1$, with $I_1 \not\subseteq M_1$. Clearly


$$R = I_1 + M_1.$$

Next, using again the fact that $R$ is right Artinian, we may extract a minimal right
ideal $I_2 \subseteq M_1$. Since rad(R) = 0 there is a maximal right ideal, say $M_2$, such that
$I_2 \not\subseteq M_2$. Since $M_1 = R \cap M_1$, and since $R = I_2 + M_2$, we may use the modular
law to obtain


$$M_1 = R \cap M_1 = (I_2 + M_2) \cap M_1 = I_2 + (M_2 \cap M_1),$$


from which we conclude that

$$R = I_1 + M_1 = I_1 + I_2 + (M_1 \cap M_2).$$


We may continue in this fashion to produce a sequence $M_1, M_2, \ldots$ of maximal right
ideals and a sequence $I_1, I_2, \ldots$ of minimal right ideals such that


$$R = I_1 + I_2 + \cdots + I_k + (M_1 \cap M_2 \cap \cdots \cap M_k).$$


Since $R$ is right Artinian, the strictly decreasing sequence

$$M_1 \supsetneq M_1 \cap M_2 \supsetneq \cdots \supsetneq \bigcap_{i=1}^{t} M_i \supsetneq \cdots$$


must terminate. Since $R$ is semiprimitive, it must terminate at 0. This means that for
a suitable positive integer $s$, we will have found finitely many minimal right ideals
$I_1, I_2, \ldots, I_s$ such that

$$R = I_1 + I_2 + \cdots + I_s.$$


The proof is complete. 


12.6 Exercises 


12.6.1 General Exercises 


1. Let $D$ be a PID and let $0 \neq a \in D$ be a non-unit. Compute the radical and the
socle of the cyclic $D$-module $D/aD$.
2. Let $D$ be a PID.


(a) Show that every nonzero submodule of the free right $D$-module $D_D$ is large.
(b) Let $p \in D$ be a prime. Show that for every integer $n \geq 1$, every nonzero
submodule of the cyclic $D$-module $D/p^n D$ is large.

3. Show that every nonzero $\mathbb{Z}$-submodule of $\mathbb{Z}(p^\infty)$ is large.
4. Let $F$ be a field and let $A \subseteq M_n(F)$ be the ring of lower triangular $n \times n$ matrices
over $F$. Let $V$ be the $A$-module of $1 \times n$ matrices


$$V = \{[a_1 \; a_2 \; \cdots \; a_n] \mid a_1, a_2, \ldots, a_n \in F\}.$$


Compute Soc(V) and rad(V) and show that they are both large submodules.


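5. Let $M$ be a right $R$-module. Show that the socle of $M$ is the intersection of the
large submodules of $M$.


6. Let $R = M_2(F)$, where $F$ is a field. Show that

$$L = \left\{ \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} \;\middle|\; a, b \in F \right\}$$

is a minimal right ideal in $R$ and hence is an irreducible $R$-module. Show that
any other minimal right ideal of $R$ is isomorphic to $L$ as a right $R$-module. [Hint:
If

$$L' = \left\{ \begin{bmatrix} 0 & 0 \\ a & b \end{bmatrix} \;\middle|\; a, b \in F \right\},$$

then $L' \cong_R L$.] Conclude that $R_R = R[L]$.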


7. Let $F$ be a field and let $x$ be an indeterminate over $F$. Let $A$ be the $2 \times 2$ matrix


a=[t 9] 


Make $V = F^2 = \{[a \; b] \mid a, b \in F\}$ into an $F[x]$-module in the usual way and
compute Soc(V) and rad(V).


8. Let $F$ be a field, and let $A \in M_n(F)$. Let $M$ be the set of $1 \times n$ matrices over
$F$ regarded as a right $F[x]$-module in the usual way. If $A$ is diagonalizable over
$F$, show that $M$ is a completely reducible $F[x]$-module. (This strong hypothesis
isn't necessary. In fact all one really requires is that the minimal polynomial of
$A$ factor into distinct irreducible factors in $F[x]$.)


9. A right $R$-module $M$ is called a prime module if for any submodule $N \subseteq M$,
and any ideal $I \subseteq R$, we have that $NI = 0$ implies that either $N = 0$ or that
$MI = 0$. Show that the annihilator in $R$ of a prime module is a prime ideal.
Define a primitive ideal in the ring $R$ to be the annihilator of an irreducible
right $R$-module. Now show that we have the following hierarchy for ideals in
the ring $R$:


maximal ideal $\Rightarrow$ primitive ideal $\Rightarrow$ prime ideal.


10. Let $R$ and $S$ be rings, and let $\rho$ and $\sigma$ be antiautomorphisms of $R$ and $S$, respectively.
Given an $(R, S)$-bimodule $M$, define new multiplications


$$S \times M \times R \to M$$

by the rules that

$$sm = m(s^\sigma) \quad \text{and} \quad mr = (r^\rho)m,$$


for all $(s, m, r) \in S \times M \times R$.


(a) Show that with respect to these new multiplications, the additive group
$(M, +)$ is endowed with the structure of an $(S, R)$-bimodule, which we
denote as ${}^\sigma M^\rho$.
(b) Show that if $\mathbb{Z}$ is the ring of integers, any abelian additive group $(M, +)$ is
a $(\mathbb{Z}, \mathbb{Z})$-bimodule where the action of an integer $n$ on a module element $m$
(from the right or left) is to add $m$ or $-m$ to itself $|n|$ times, according as $n$ is
positive or negative, and to set $0m = m0 = 0$. With this interpretation, any
left $R$-module is an $(R, \mathbb{Z})$-bimodule, and any right $S$-module is already a
$(\mathbb{Z}, S)$-bimodule.
(c) By applying part (a) with one of $(R, \rho)$ or $(S, \sigma)$ set equal to $(\mathbb{Z}, 1_\mathbb{Z})$, the ring of
integers with the identity antiautomorphism, show that any left $R$-module $M$
can be converted into a right $R$-module $M^\rho$. Similarly, any right $S$-module
$N$ can be converted into a left $S$-module, ${}^\sigma N$, by the recipe of part (a).
(d) Suppose $N$ is a monoid with anti-automorphism $\mu$. Let $K$ be a field and let
$KN$ denote the monoid ring, that is, a $K$-vector space with the elements
of $N$ as a basis, and all multiplications of $K$-linear combinations of these
basis elements determined (via the distributive laws) by the multiplication
table of $N$. Define $\hat{\mu} : KN \to KN$ by applying $\mu$ to each basis element of $N$
in each $K$-linear combination, while leaving the scalars from $K$ unaffected.
Show that $\hat{\mu}$ is an anti-automorphism of $KN$.
(e) Suppose now $N = G$, a group. Show that every anti-automorphism $\sigma$ of $G$
has the form $g \mapsto (g^{-1})^\alpha$, where $\alpha$ is a fixed automorphism of $G$.


12.6.2 Warm Up Exercises for Sects. 12.1 and 12.2 


1. An element $x$ of a ring $R$ is nilpotent if and only if, for some positive integer $n$,
$x^n = 0$. Show that no nilpotent element is right or left invertible.

2. Show that if $x$ is a nilpotent element, then $1 - x$ is a unit of $R$.

3. A right ideal $A$ of $R$ is said to be nilpotent if and only if, for some positive integer
$n$ (depending on $A$), $A^n = 0$. Show that if $A$ and $B$ are nilpotent right ideals of
$R$, then so is $A + B$.

4. (An interesting example) Let $F$ be any field. Let $F\langle x, y \rangle$ denote the ring of all
polynomials in two “non-commuting” variables $x$ and $y$. Precisely, $F\langle x, y \rangle$ is the
monoid-ring $FM$ where $M$ is the free monoid generated by the two-letter alphabet
$\{x, y\}$. (We shall see later on that this ring is the tensor algebra $T(V)$ where $V$ is
a two-dimensional vector space over $F$.) Let $I$ be the 2-sided ideal generated by
$\{x^2, y^2\}$, and form the factor ring $R := F\langle x, y \rangle / I$.

(a) For each positive integer $k$, let


$$(x)_k := xyx\cdots \ (k \text{ factors}) + I,$$
$$(y)_k := yxy\cdots \ (k \text{ factors}) + I.$$


Show that the set $\{1, (x)_k, (y)_k \mid 0 < k \in \mathbb{Z}\}$ is a basis for $R$ as a vector space
over $F$. Thus each element of $R$ is a finite sum


$$r = \alpha_0 \cdot 1 + \sum_{i \in J} \big( \alpha_i (x)_i + \beta_i (y)_i \big),$$


where $J$ is a finite set of positive integers.

(b) Show that the sum $(x + I)R + (y + I)R$ of two right ideals is direct, and is
a maximal two-sided ideal of $R$.

(c) An element $(x)_k$ is said to be x-palindromic if it begins and ends in $x$, i.e.,
if $k$ is odd. Similarly $(y)_k$ is y-palindromic if and only if $k$ is odd. Show that
any linear combination of x-palindromic elements is nilpotent.

(d) Let $P_x$ ($P_y$) denote the vector subspace spanned by all x-palindromic
(y-palindromic) elements. Then the group of units of $R$ contains $(1 + P_x) \cup
(1 + P_y)$ and hence all possible products of these elements.

(e) Show that for any positive integer $k$, $(x + y)^k = (x)_k + (y)_k$. (Here is a sum
of two nilpotent elements which is certainly not nilpotent.)


5. A right ideal is said to be a nil right ideal if and only if all of its elements are
nilpotent. (If the nil right ideal is 2-sided, we simply call it a nil ideal.)


(a) For any family $\mathcal{F}$ of nilpotent right ideals, show that the sum $\sum \mathcal{F}$ of all
ideals in $\mathcal{F}$ is a nil ideal.

(b) Show that if $A$ is a nilpotent right ideal, then so is $rA$.

(c) Define $N(R)$ to be the sum of all nilpotent right ideals of $R$. (This invariant
of $R$ can be thought of as a certain kind of “radical” of $R$.) Show that $N(R)$
is a 2-sided nil ideal.
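
The ring of Exercise 4 is easy to experiment with. The following is a minimal sketch, assuming integer coefficients in place of a general field F and a hypothetical dictionary representation (word to coefficient); a word is zero in the quotient exactly when it contains xx or yy, so the surviving basis consists of the alternating words. It is meant only as a way to test small cases of parts (c) through (e), not as a proof.

```python
from collections import defaultdict

def is_zero_word(word):
    """A word vanishes in F<x, y>/I exactly when it contains xx or yy."""
    return "xx" in word or "yy" in word

def add(a, b):
    """Sum of two elements, each a dict mapping a word to its coefficient."""
    out = defaultdict(int)
    for element in (a, b):
        for word, coeff in element.items():
            out[word] += coeff
    return {w: c for w, c in out.items() if c != 0}

def mul(a, b):
    """Product: concatenate words, then discard those lying in the ideal I."""
    out = defaultdict(int)
    for u, cu in a.items():
        for v, cv in b.items():
            word = u + v
            if not is_zero_word(word):
                out[word] += cu * cv
    return {w: c for w, c in out.items() if c != 0}

def power(a, k):
    """k-th power of a; the empty word plays the role of 1."""
    result = {"": 1}
    for _ in range(k):
        result = mul(result, a)
    return result

def alternating(first, k):
    """The alternating word of length k starting with `first`, i.e. (x)_k or (y)_k."""
    other = "y" if first == "x" else "x"
    return "".join(first if i % 2 == 0 else other for i in range(k))

X = {"x": 1}
Y = {"y": 1}
ONE = {"": 1}
```

For instance, `power(add(X, Y), 5)` has exactly the two terms `xyxyx` and `yxyxy`, in line with part (e), while the x-palindromic combination `{"x": 1, "xyx": 1}` squares to zero, in line with part (c).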


12.6.3 Exercises Concerning the Jacobson Radical 


Recall that the Jacobson radical of a ring $R$ (with identity) was defined to be the
intersection of all the maximal right ideals of $R$, and was denoted by the symbol rad(R).


1. Show that an element r of the ring R is right invertible if and only if it belongs 
to no maximal right ideal. 

2. Let “1” denote the multiplicative identity element of R. Show that the following 
four conditions on an element r € R are equivalent: 


(a) $r$ belongs to the Jacobson radical of $R$.
(b) For each maximal right ideal $M$, 1 does not belong to $M + rR$.
(c) For every element $s$ in $R$, $1 - rs$ belongs to no maximal right ideal of $R$.
(d) For every element $s$ in $R$, $1 - rs$ is right invertible.


3. Show that if an element $1 - rt$ is right invertible in $R$, then so is $1 - tr$. [Hint:
By hypothesis there exists an element $u$ such that $(1 - rt)u = 1$. So we have
$u = 1 + rtu$. Use this to show that the product


$$(1 - tr)(1 + tur) = 1 + tur - t(1 + rtu)r$$


is just 1.] Prove that rad(R) is a 2-sided ideal. [Hint: Show that if $t \in$ rad(R),
then so is $rt$.]


4. Show that rad(R) contains every nilpotent 2-sided ideal. [Hint: If false, there
is a nilpotent 2-sided ideal $A$ and a maximal right ideal $M$ such that $A \not\subseteq M$ while
$R = A + M$. Then $1 = a + m$ where $a \in A$ and $m \in M$. Thus $1 - a$ is on the
one hand an element of $M$, and on the other hand it is a unit by Exercise (2) in
Sect. 12.6.2.]
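
The equivalence of conditions (a) and (d) in Exercise 2 can be observed directly in a small commutative example. The following is a minimal sketch, assuming R = Z/nZ with n at least 2 (a choice made only for illustration, with hypothetical function names); in this ring right ideals are just ideals, the maximal ideals are the sets of multiples of the primes dividing n, and an element is a unit exactly when it is coprime to n.

```python
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of n, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def radical_by_intersection(n):
    """rad(Z/nZ) computed as the intersection of the maximal ideals pZ/nZ."""
    rad = set(range(n))
    for p in prime_divisors(n):
        rad &= {x for x in range(n) if x % p == 0}
    return rad

def is_unit(x, n):
    """x is invertible in Z/nZ exactly when gcd(x, n) = 1."""
    return gcd(x % n, n) == 1

def radical_by_units(n):
    """Elements r such that 1 - r*s is a unit for every s, as in condition (d)."""
    return {r for r in range(n) if all(is_unit(1 - r * s, n) for s in range(n))}
```

For n = 12 both functions return {0, 6}; note that 6 is nilpotent modulo 12, consistent with Exercise 4.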


12.6.4 The Jacobson Radical of Artinian Rings 


A ring whose poset of right ideals satisfies the descending chain condition is called 
an Artinian ring. The next group of exercises will lead the student to a proof that for 
any Artinian ring, the Jacobson radical is a nilpotent ideal. 


1. We begin with something very elementary. Suppose $b$ is a non-zero element of
the ring $R$. If $b = ba$, then there is a maximal right ideal not containing $a$; in
particular $a$ is not in the Jacobson radical. [Hint: Just show that $1 - a$ has no right
inverse.]


2. Suppose $R$ is an Artinian ring, and $A$ is a non-zero ideal for which $A^2 = A$. Show
that there exists a non-zero element $b$ and an element $a \in A$ such that $ba = b$.
[Hint: Consider $\mathcal{F}$, the collection of all non-zero right ideals $C$ such that $CA \neq 0$.
$A \in \mathcal{F}$, so this family is non-empty. By the DCC there is a minimal element $B$ in
$\mathcal{F}$. Then for some $b \in B$, $bA \neq 0$. Then as $bA = bA^2$, $bA \in \mathcal{F}$. But minimality
of $B$ shows $bA = B$, and the result follows.]


3. Show that if $R$ is Artinian, then rad(R) is a nilpotent ideal. [Hint: $J :=$ rad(R) is an
ideal (Exercise (3) in Sect. 12.6.3). Suppose by way of contradiction that $J$ is not
nilpotent. Use the DCC to show that for some positive integer $k$, $A := J^k = J^{k+1}$,
so $A^2 = A \neq (0)$. Use Exercise (2) in Sect. 12.6.4 to obtain a non-zero element $b$
with $ba = b$ for some $a \in A$. Use Exercise (1) in Sect. 12.6.4 to argue that $a$ is
not in $J$. It is apparent that the last assertion is a contradiction.]


4. Show that an Artinian ring with no zero divisors is a division ring. [No hints this
time. Instead a challenge.]
12.6.5 Quasiregularity and the Radical


Given a ring $R$, we define a new binary operation “$\circ$” on $R$, by declaring

$$a \circ b := a + b - ab, \quad \text{for all } a, b \in R.$$


The next few exercises investigate invertibility properties of elements with respect to 
this operation, and use these facts to describe a new poset of right ideals for which the 
Jacobson radical is a supremum. One reward for this will be a dramatic improvement 
of the elementary result in Exercise (4) in Sect. 12.6.3: the Jacobson radical actually 
contains every nil right ideal. 


1. Show that $(R, \circ)$ is a monoid with monoid identity element being the zero element,
0, of the ring $R$.

2. An element $x$ of $R$ is said to be right quasiregular if it has a right inverse in
$(R, \circ)$, that is, there exists an element $r$ such that $x \circ r = 0$. Similarly, $x$ is
left quasiregular if $\ell \circ x = 0$ for some $\ell \in R$. The elements $r$ and $\ell$ are called
right and left quasi-inverses of $x$, respectively. (They need not be unique.) If $x$ is
both left and right quasiregular, then $x$ is simply said to be quasiregular. You are
asked here to verify again a fundamental property of all monoids: Show that if $x$
is quasiregular, then any left quasi-inverse is equal to any right quasi-inverse, so
there is a unique element $x^\circ$ such that


$$x^\circ \circ x = x \circ x^\circ = 0.$$


Show also that $(x^\circ)^\circ = x$, for any quasiregular $x \in R$.
3. This exercise has four parts, all elementary.


(a) Show that for any elements $a, b \in R$,

$$a \circ b = 1 - (1 - a)(1 - b).$$


(b) Show that x is right (left) quasiregular if and only if 1 — x is right (left) 
invertible in R. From this, show that if x is quasiregular, then 1 — x has 
1 — x° for a two-sided multiplicative inverse. 

(c) Conclude that x is quasiregular if and only if 1 — x is a unit. 

(d) Prove that any nilpotent element a of R is quasiregular. [Hint: Think about 
the geometric series in powers of a.] 


4. Now we apply these notions to right ideals. Say that a right ideal is quasiregular 
if and only if each of its elements is quasiregular. Prove the following 


Lemma 12.6.1 If every element of a right ideal $K$ is right quasiregular, then each
of its elements is also quasiregular, that is, $K$ is a quasiregular right ideal.

[Hint: First observe that if $x \in K$ and $y$ is a right quasi-inverse of $x$, then $y \in K$
as well. So now $y$ has both left and right quasi-inverses which must be $x$ (why?).
Why does this make $x$ quasiregular?]


5. Next prove the following: 


Lemma 12.6.2 Suppose K is a quasiregular right ideal and M is a maximal right 
ideal. Then K C M. (Thus every quasiregular right ideal lies in the Jacobson 
radical.) 


[Hint: If $K$ does not lie in $M$, $R = M + K$ so $1 = m + k$ where $m \in M$ and
$k \in K$. Now $k$ has a right quasi-inverse $z$. As above, $z \in K$. Since $k \circ z = 0$ means
$kz - z = k$, we have


$$0 = 1 \cdot z - z = (m + k)z - z = mz + (kz - z) = mz + k,$$


to force $k \in M$, an absurdity.]


6. Show that the Jacobson radical is itself a right quasiregular ideal. [Hint: Show 
each element y € rad(R) is right quasiregular. If not, (1 — y)R lies in some 
maximal right ideal M.] We have now reached the following result: 


Lemma 12.6.3. The radical rad(R) is the unique supremum in the poset of right 
quasiregular ideals of R. 


7. Prove that every nil right ideal lies in the Jacobson radical rad(R). [Hint: It
suffices to show that every nil right ideal is quasiregular. There are previous
exercises that imply this.]

8. Are there rings $R$ (with identity) in which some nilpotent elements do not lie in
the Jacobson radical? (Try to find a real example.)
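
The operation a ∘ b = a + b − ab defined at the start of this section, together with the geometric-series hint of Exercise 3(d), can be tried out numerically. The following is a minimal sketch, assuming the ring is Z/nZ (an illustrative choice; the function names are hypothetical). For a nilpotent element a, the quasi-inverse is taken to be −(a + a² + a³ + ...), which is 1 minus the finite geometric series for the inverse of 1 − a.

```python
def circ(a, b, n):
    """The operation a o b = a + b - ab, computed in Z/nZ."""
    return (a + b - a * b) % n

def quasi_inverse_of_nilpotent(a, n):
    """Quasi-inverse -(a + a^2 + ...) of a nilpotent element a of Z/nZ.

    Raises ValueError if no power a^k with k <= n + 1 vanishes, in which
    case a is not nilpotent modulo n.
    """
    total, current = 0, a % n
    for _ in range(n + 1):
        if current == 0:
            return (-total) % n
        total = (total + current) % n
        current = (current * a) % n
    raise ValueError(f"{a} is not nilpotent modulo {n}")

def identity_3a_holds(n):
    """Check a o b = 1 - (1 - a)(1 - b) for all a, b in Z/nZ."""
    return all(
        circ(a, b, n) == (1 - (1 - a) * (1 - b)) % n
        for a in range(n) for b in range(n)
    )

def is_monoid_with_identity_zero(n):
    """Check that (Z/nZ, o) is associative and has 0 as identity element."""
    elements = range(n)
    unital = all(circ(a, 0, n) == a and circ(0, a, n) == a for a in elements)
    associative = all(
        circ(circ(a, b, n), c, n) == circ(a, circ(b, c, n), n)
        for a in elements for b in elements for c in elements
    )
    return unital and associative
```

In Z/8Z, for example, the nilpotent element 2 has quasi-inverse 2, since 2 + 2 − 4 = 0; the same function applied to 3 raises an error because 3 is a unit modulo 8 rather than a nilpotent element.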


12.6.6 Exercises Involving Nil One-Sided Ideals in Noetherian 
Rings 


1. Show that if B is a nil right ideal, then for any element b € B, Rb is a nil left 
ideal. 

2. For any subset $X$ of a ring $R$, let $X^\perp := \{r \in R \mid Xr = 0\}$, the right annihilator of
$X$.


(a) Show that always $X^\perp R \subseteq X^\perp$.

(b) Show that $X^\perp \subseteq (RX)^\perp$.

(c) Show that $X^\perp$ is a right ideal.

(d) Show that if $X$ is a left ideal of $R$, then $X^\perp$ is a 2-sided ideal.


3. Now suppose $L$ is a left ideal, and $P$ is the poset of right ideals of the form $y^\perp$ as
$y$ ranges over the non-zero elements of $L$ (partially ordered by the containment
relation).

(a) Show that $R$ cannot be a member of $P$.
(b) Show that if $y^\perp$ is a maximal member of $P$, then $(ry)^\perp = y^\perp$ for all $r \in R$.


4. Suppose R is Noetherian—that is, it has the ascending chain condition on right 
ideals. 


(a) Show that if $L$ is a left ideal, then there exists an element $u \in L - \{0\}$ such
that $u^\perp = (ru)^\perp$, for all $r \in L$.

(b) Suppose $L$ is a nil left ideal and that $u$ is chosen as in Part (a) of this exercise.
Show that $uRu = 0$. Conclude from this that $(RuR)^2 = 0$, so $u$ generates a
nilpotent 2-sided ideal. [Hint: Note that if $k$ is the least positive integer such
that $u^k = 0$, then $u \in (u^{k-1})^\perp = u^\perp$, so $Ru \subseteq u^\perp$.]

(c) Because $R$ is Noetherian, there exists an ideal $M$ which is maximal among
all nilpotent 2-sided ideals. Show that $M$ contains every nil left ideal. [Hint:
If $L$ is a nil left ideal, so is $L + M$, and so $(L + M)/M$ is a nil left ideal of
the Noetherian ring $R/M$. Now if $L + M$ is not contained in $M$, Part (b) of
this exercise applied to $R/M$ will yield a non-zero nilpotent 2-sided ideal
in $R/M$. Why is this absurd?]

(d) Show that every nil right ideal lies in a nilpotent 2-sided ideal. [Hint: Let
$N$ be a nil right ideal and let $M$ be the maximal nilpotent 2-sided ideal of
part (c) of this exercise. ($M$ exists by the Noetherian condition.) By way
of contradiction assume $n \in N - M$. Select an element $r \in R$. Since $N$
is a right nil ideal there exists an integer $k \geq 2$ such that $(nr)^k = 0$. Then
$(rn)^{k+1} = r \cdot (nr)^k \cdot n = 0$. Thus $Rn$ is a left nil ideal, and so, by part (c) of
this exercise, lies in $M$ against our choice of $n$.]

(e) In a Noetherian ring, the sum of all nilpotent right ideals, $N(R)$, is a nil
2-sided ideal (Exercise (5) in Sect. 12.6.2). Prove the following:


Theorem 12.6.4 (Levitsky) In a Noetherian ring, N(R) is a nilpotent ideal and it 
contains all left and right nil ideals. 


Remarks The reader is advised that most of the theory exposed in these exercises also
exists in “general ring theory” where (unlike this limited text) rings are not required
to possess a multiplicative identity element. The proofs of the results without the
assumption of an identity element, especially the results about radicals, require a
bit of ingenuity. There are three basic ways to handle this theory without a
multiplicative identity. (1) Embed the general ring in a ring with identity (there is a rather
‘minimal’ way to do this) and apply the theory for these rings. Only certain sorts
of statements about general rings can be extracted in this way. For example,
quasiregularity can be defined in any general ring. But to establish the relation connecting
quasiregularity of $x$ with $1 - x$ being a unit, the embedding is necessary. (2) Another
practice is to exploit the plentitude of right ideals by paying attention only to right
ideals possessing properties they would have had if $R$ had an identity element. Thus
a regular right ideal in a general ring is a right ideal $J$ for which there is an element
$e$ such that $r - er \in J$ for all $r$ in the ring. The Jacobson radical is then defined
to be the intersection of all such regular right ideals. With this modified definition,
nearly all of the previous exercises regarding the Jacobson radical can be emulated in
this more general context. But even this is just “scratching the surface”. There is an
extensive literature on all kinds of radicals of general rings. (3) In Artinian general
rings, one can often find sub-(general)-rings containing an idempotent serving as a
multiplicative identity for that subring. Then one can apply any relevant theorem
about rings with identity to such subrings.

Two particularly good sources extending the material in this chapter are the classic
Rings and Modules by J. Lambek [2] and the excellent and thorough book of
J. Dauns, Modules and Rings [1].¹

¹ As indicated by the titles of these two books, they have this subject covered “coming and going”.


Chapter 13 
Tensor Products 


Abstract No algebra course would be complete without introducing the student to 
the language of category theory. Some properties of the objects of algebra are defined 
by their internal structure, while other properties describe how the object sits in a 
morphism-closed environment. Universal mapping properties are of the latter sort. 
Their relation to initial and terminal objects of another suitably-chosen category is 
emphasized. The tensor product in the category of right R-modules is defined in 
two ways: as a constructed object, and as a unique solution to a universal mapping 
problem. From the tensor product one derives functors which are adjoint to the “Hom” 
functors. Another feature is that tensor products can also be defined for F-algebras. 
The key facts that tensor products “distribute” over direct sums and that there is
a uniform way to define multiple tensor products allow one to define the tensor
algebra. In the category of F-algebras generated by n elements, this algebra becomes 
an initial object. This graded algebra, T(V), is uniquely determined by an F-vector 
space V and has two important homomorphic offspring: the symmetric algebra, 
S(V) (modeled by polynomial rings), and the exterior algebra, E(V), (modeled 
by an algebra on poset chains). In the category of vector spaces, T, S and E, and
their restrictions to the homogeneous summands, are all functors; that is, morphisms
among vector spaces induce morphisms among the algebras and their components 
of fixed degree. Herein lie the basic theorems concerning multilinear forms. 


13.1 Introduction 


In this chapter we shall present the notion of tensor product. The exposition will 
emphasize many of its “categorical” properties (including its definition), as well as 
its ubiquity in multilinear algebra (and algebra in general!). In so doing, we shall 
find it convenient to weave some rather general discussions of “category theory” 
into this chapter, which will also help to explain the universal nature of many of the 
constructions that we’ve given thus far. 


© Springer International Publishing Switzerland 2015
E. Shult and D. Surowski, Algebra, DOI 10.1007/978-3-319-19734-0_13 




13.2 Categories and Universal Constructions 


13.2.1 Examples of Universal Constructions 


In past chapters, we have encountered “universal mapping properties”. They always
seem to say that given a certain situation, there exists a special collection of maps such
that for every similar collection of maps, there is a unique collection of morphisms
to or from that special collection. So far we have discussed such “universal mapping
properties” for special realms such as rings, or R-modules. But it is time to give
this notion a concrete setting. The natural home for describing “universal mapping
properties” in a general way is a “category”. In this section, we shall define the notion
of a category, which will not only unify some of the concepts that we’ve considered
thus far, but also give meaning to the concept of a “universal construction”, that is,
the construction of an object satisfying a “universal mapping property”. Examples
of universal constructions given so far include


1. the kernel of a group (or ring or module) homomorphism;
2. the commutator subgroup [G, G] of a group G;
3. the direct product of groups (rings, modules, ...);
4. the direct sum of modules;
5. the free group on a set;
6. the free module on a set.


We shall encounter more examples. The reader is likely to wonder what the above 
examples have in common. The objective of this discussion is to make that clear. 

Before turning to the formal definitions, let us indicate here the “universality”
of kernels. Indeed, let $\phi : G \to H$ be a homomorphism of groups, with kernel
$\mu : K \hookrightarrow G$, where $\mu$ is the inclusion mapping.¹ From this, it is a trivial fact that the
composition $\phi \circ \mu : K \to H$ is the trivial homomorphism. With $\phi : G \to H$ fixed,
the “universal” property of the pair $(K, \mu)$ is as follows: Suppose that $(K', \mu')$ is
another pair consisting of a group $K'$ and a homomorphism $\mu' : K' \to G$ such that
$\phi \circ \mu' : K' \to H$ is also the trivial homomorphism. It follows from Theorem 3.4.5 that
there exists a unique homomorphism $\theta : K' \to K$ such that $\mu \circ \theta = \mu' : K' \to G$.
In other words, the homomorphism $\mu' : K' \to G$ satisfying the given property (viz.,
that $\phi \circ \mu'$ is the trivial homomorphism) must occur through the “courtesy” of the
homomorphism $\mu : K \to G$. Perhaps a commutative diagram depicting the above
would be helpful:


          mu           phi
    K --------> G --------> H
    ^          ^
    |        /
  theta    /  mu'
    |    /
    K'


¹ The student may be more accustomed to saying that $K = \{g \in G \mid \phi(g) = 1_H\}$ is a subgroup of
G. But these “category people” like to think in terms of morphisms and their compositions.

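
The factorization through the kernel can be seen concretely in a small case. The following is a minimal sketch, assuming additive cyclic groups chosen only for illustration: G = Z/12, H = Z/4, the map phi is reduction modulo 4, and the competing pair is K' = Z/6 with mu'(x) = 4x mod 12 (all names are hypothetical). Since the groups are written additively, the trivial homomorphism is the constant map to 0. The search runs over every set map from K' to K and keeps those satisfying mu ∘ theta = mu'.

```python
from itertools import product

G = list(range(12))        # the group Z/12 under addition mod 12
K_PRIME = list(range(6))   # the group Z/6 under addition mod 6

def phi(g):
    """phi : Z/12 -> Z/4, reduction modulo 4."""
    return g % 4

K = [g for g in G if phi(g) == 0]   # the kernel of phi, here [0, 4, 8]

def mu(k):
    """The inclusion mapping of the kernel K into G."""
    return k

def mu_prime(x):
    """A homomorphism Z/6 -> Z/12 whose composite with phi is trivial."""
    return (4 * x) % 12

def factorizations():
    """All set maps theta : K' -> K (as dicts) with mu(theta(x)) == mu_prime(x)."""
    found = []
    for values in product(K, repeat=len(K_PRIME)):
        theta = dict(zip(K_PRIME, values))
        if all(mu(theta[x]) == mu_prime(x) for x in K_PRIME):
            found.append(theta)
    return found

def is_homomorphism(theta):
    """Check theta(x + y) = theta(x) + theta(y), with sums taken in Z/6 and Z/12."""
    return all(
        theta[(x + y) % 6] == (theta[x] + theta[y]) % 12
        for x in K_PRIME for y in K_PRIME
    )
```

Exactly one map survives the search, and it is a homomorphism. Uniqueness here is forced by the injectivity of the inclusion mu, which is the same reason it holds in general.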
As another example, we consider the direct sum $D := \bigoplus_{\alpha \in A} M_\alpha$ of a family
$\{M_\alpha \mid \alpha \in A\}$ of right $R$-modules (where $R$ is a fixed ring). Thus $D$ is a right
$R$-module having, for each $\alpha \in A$, an injective homomorphism $\mu_\alpha : M_\alpha \to D$. The
universal property of the direct sum is as follows: if $\{(D', \mu'_\alpha) \mid \alpha \in A\}$ is another
family consisting of an $R$-module $D'$ and $R$-module homomorphisms $\mu'_\alpha : M_\alpha \to
D'$, then there must exist a unique homomorphism $\theta : D \to D'$ making all of the
relevant triangles commute, i.e., for each $\alpha \in A$, we have $\theta \circ \mu_\alpha = \mu'_\alpha$. Thus, in
analogy with our first example (that of the kernel), all homomorphisms must factor
through the “universal object.” The relevant picture is depicted below:


            mu_alpha
    M_alpha ---------> D
          \            |
 mu'_alpha \           | theta
            \          v
             `-------> D'


13.2.2 Definition of a Category 


In order to unify the above examples as well as to clarify the “universality” of the
given constructions, we start with a few key definitions. First of all, a category is a
triple $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}, \circ)$ where


1. $\mathcal{O}$ is a class (of objects of $\mathcal{C}$),

2. Hom assigns to each pair $(A, B)$ of objects a class $\mathrm{Hom}(A, B)$ (which we sometimes
write as $\mathrm{Hom}_{\mathcal{C}}(A, B)$ when we wish to emphasize the category $\mathcal{C}$), whose
elements are called morphisms from $A$ to $B$, and

3. There is a law of composition,


$$\circ : \mathrm{Hom}(B, C) \times \mathrm{Hom}(A, B) \to \mathrm{Hom}(A, C),$$


defined for all objects $A$, $B$, and $C$ for which $\mathrm{Hom}(A, B)$ and $\mathrm{Hom}(B, C)$ are
non-empty. (The standard convention is to write composition as a binary operation;
thus if $(f, g) \in \mathrm{Hom}(B, C) \times \mathrm{Hom}(A, B)$, one writes $f \circ g$ for $\circ(f, g)$.)
Composition is subject to the following rules:


(a) Where defined, the (binary) composition on morphisms is associative.
(b) For each object $A$, $\mathrm{Hom}(A, A)$ contains a special morphism, $1_A$, called
the identity morphism at $A$, such that for all objects $B$, and for all $f \in
\mathrm{Hom}(A, B)$, $g \in \mathrm{Hom}(B, A)$,


$$f \circ 1_A = f, \qquad 1_A \circ g = g.$$


NOTE: When object $A$ is a set, we often write $\mathrm{id}_A$ for $1_A$, since the identity mapping
$\mathrm{id}_A : A \to A$, taking each element of $A$ to itself, is the categorical morphism $1_A$ in
that case.
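
For a category with finitely many objects and morphisms, the axioms above can be verified mechanically. The following is a minimal sketch under an assumed encoding (not taken from the text): a morphism is a triple (source, label, target), `compose(f, g)` returns f ∘ g with g applied first, and `identity` maps each object to its identity morphism. The two sample categories are a monoid regarded as a one-object category and a group whose elements are the objects.

```python
from itertools import product

def check_category(objects, morphisms, compose, identity):
    """Return True when the finite data satisfies the category axioms.

    morphisms: iterable of (source, label, target) triples.
    compose(f, g): the composite f o g, used only when target(g) == source(f).
    identity: dict sending each object A to its identity morphism 1_A.
    """
    morphisms = set(morphisms)
    # Each identity must be a morphism from its object to itself.
    for a in objects:
        ida = identity[a]
        if ida not in morphisms or ida[0] != a or ida[2] != a:
            return False
    # Composition must land in Hom(source(g), target(f)).
    for f, g in product(morphisms, repeat=2):
        if g[2] == f[0]:
            fg = compose(f, g)
            if fg not in morphisms or fg[0] != g[0] or fg[2] != f[2]:
                return False
    # Identity laws: f o 1_A = f and 1_B o f = f for f in Hom(A, B).
    for f in morphisms:
        if compose(f, identity[f[0]]) != f or compose(identity[f[2]], f) != f:
            return False
    # Associativity wherever the triple composite is defined.
    for f, g, h in product(morphisms, repeat=3):
        if h[2] == g[0] and g[2] == f[0]:
            if compose(f, compose(g, h)) != compose(compose(f, g), h):
                return False
    return True

# A monoid as a one-object category: (Z/4, multiplication), identity label 1.
monoid_objects = ["*"]
monoid_morphisms = [("*", m, "*") for m in range(4)]
def monoid_compose(f, g):
    return ("*", (f[1] * g[1]) % 4, "*")
monoid_identity = {"*": ("*", 1, "*")}

# A group as a category: objects are the elements of Z/3 (written additively),
# and Hom(x, y) holds the single element g with g + x = y.
group_objects = list(range(3))
group_morphisms = [(x, (y - x) % 3, y) for x in range(3) for y in range(3)]
def group_compose(f, g):
    return (g[0], (f[1] + g[1]) % 3, f[2])
group_identity = {x: (x, 0, x) for x in range(3)}
```

Both samples pass the check, while replacing the monoid identity label 1 by 0 makes the identity law fail, so the checker does distinguish valid data from invalid data.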

Occasionally one must speak of subcategories. If $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}, \circ)$ is a category,
then any subcollection of objects, $\mathcal{O}_0 \subseteq \mathcal{O}$, determines an induced subcategory
$\mathcal{C}_0 := (\mathcal{O}_0, \mathrm{Hom}_0, \circ)$, where $\mathrm{Hom}_0$ denotes those morphisms in Hom which connect
two objects in $\mathcal{O}_0$. There is also another kind of subcategory in which the collection
of morphisms is restricted. Let $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}, \circ)$, as before, and let $\mathrm{Hom}_1$ be a
subcollection of Hom containing $1_A$, for all $A \in \mathcal{O}$, with the property that if a
composition of two morphisms in $\mathrm{Hom}_1$ exists in Hom, then that composition lies
in $\mathrm{Hom}_1$. Then $(\mathcal{O}, \mathrm{Hom}_1, \circ)$ is a morphism restricted subcategory. In general a
subcategory of $\mathcal{C}$ is any category obtained from $\mathcal{C}$ by a sequence of these processes:
forming induced subcategories and restricting morphisms.

There are some rather obvious categories, as follows. 


1. The category Set, whose objects are all sets and whose morphisms are just 
mappings of sets. 

2. The categories Group, Ab, Ring, Field, $\mathrm{Mod}_R$, of groups, abelian groups,
rings, fields, and (right) $R$-modules and their relevant homomorphisms. Note that
Ab is an induced subcategory of Group.

3. The category Top of all topological spaces and continuous mappings. 

Here are a few slightly less obvious examples. 

4. Fix a group $G$ and define the category $(\mathcal{O}, \mathrm{Hom}, \circ)$ where $\mathcal{O} = G$, and where,
for $x, y \in G$, $\mathrm{Hom}(x, y) = \{g \in G \mid gx = y\}$. (Thus $\mathrm{Hom}(x, y)$ is a singleton
set, viz., $\{yx^{-1}\}$.) Here, $\circ$ is defined in the obvious way and is associative by the
associativity of multiplication in $G$.

5. Let M be a fixed monoid. We form a category with {M} as its sole object. Hom 
will be the set of monoid elements of M, and composition of any two of these is 
defined to be their product under the binary monoid operation. (Conversely, note 
that in any category with object A, Hom(A, A) is always a monoid with respect 
to morphism composition.) 

6. Let $G$ be a simple graph $(V, E)$. Thus, $V$ is a set (of vertices) and $E$ (the set of
edges) is a subset of the set of all 2-element subsets of $V$. If $x$ and $y$ are vertices of
$G$, we define a walk from $x$ to $y$ to be a finite sequence $w = (x = x_0, x_1, \ldots, x_n = y)$
where each successive pair $\{x_i, x_{i+1}\} \in E$ for $i = 0, \ldots, n - 1$. The graph is
said to be connected if, for any two vertices, there is a walk beginning at one of
the vertices and terminating at the other. For a connected graph $(V, E)$, we define
a category as follows:


(a) The objects are the vertices of $V$.
(b) If $x$ and $y$ are vertices, then $\mathrm{Hom}(x, y)$ is the collection of all walks that
begin at $x$ and end at $y$. (This is often an infinite set.)


Suppose now $u = (x = x_0, x_1, \ldots, x_n = y)$ is a walk in $\mathrm{Hom}(x, y)$, and
$v = (y = y_0, y_1, \ldots, y_m = z)$ is a walk in $\mathrm{Hom}(y, z)$. Then the composition
of the two morphisms is defined to be their concatenation, that is, the walk
$v \circ u := (x = x_0, \ldots, x_n = y_0, \ldots, y_m = z)$ from $x$ to $z$ in $\mathrm{Hom}(x, z)$.


7. If $G_1 := (V_1, E_1)$ and $G_2 := (V_2, E_2)$ are two simple graphs, a graph homomorphism
$f : G_1 \to G_2$ is a set-mapping $f : V_1 \to V_2$ such that if $\{x, y\}$ is an
edge in $E_1$, then either $f(x) = f(y) \in V_2$, or else $\{f(x), f(y)\}$ is an edge in $E_2$.
The category of simple graphs, denoted Graphs, has all simple graphs and their
graph homomorphisms as objects and morphisms. [By restricting morphisms or
objects, this category has all sorts of subcategories for which each object possesses
a unique “universal cover”.]

We conclude with two examples of categories that are especially relevant to the
constructions of kernel and direct sum given above.

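8. Let $\phi : G \to H$ be a homomorphism of groups. Define the category whose objects
are the pairs $(L, \eta)$, where $L$ is a group and where $\eta : L \to G$ is a homomorphism
such that $\phi \circ \eta : L \to H$ is the trivial homomorphism. If $(L, \eta)$, $(L', \eta')$ are two
such pairs, a morphism from $(L, \eta)$ to $(L', \eta')$ is simply a group homomorphism
$\theta : L \to L'$ such that the diagram below commutes, that is, $\eta' \circ \theta = \eta$:

$$
\begin{array}{ccc}
L & \xrightarrow{\ \eta\ } & G \\
{\scriptstyle \theta} \downarrow & \nearrow {\scriptstyle \eta'} & \\
L' & &
\end{array}
$$

It should be clear that the axioms of a category are satisfied for this example.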

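9. Let $\{M_\alpha \mid \alpha \in A\}$ be a family of right $R$-modules and form the category whose
objects are of the form $(M, \{\phi_\alpha \mid \alpha \in A\})$, where $M$ is a right $R$-module and
where each $\phi_\alpha$ is an $R$-module homomorphism from $M_\alpha$ to $M$. In this case a
morphism from $(M, \{\phi_\alpha \mid \alpha \in A\})$ to $(M', \{\phi'_\alpha \mid \alpha \in A\})$ is simply an $R$-module
homomorphism $\theta : M \to M'$ such that for all $\alpha \in A$, the triangle formed by
$\phi_\alpha$, $\phi'_\alpha$, and $\theta$ commutes, that is, $\theta \circ \phi_\alpha = \phi'_\alpha$.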
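The walk category of example 6 lends itself to a small computational sketch. The Python fragment below is an illustration only: the four-vertex graph and the names `EDGES`, `is_walk`, `compose`, and `identity` are assumptions introduced here, and the one-vertex walk $(x)$ (the case $n = 0$ of the definition) is taken as the identity morphism at $x$. Walks are stored as tuples of vertices, so that composition is concatenation with the shared endpoint written once.

```python
# Walk category of a small connected simple graph (example 6).
# The graph below is a hypothetical choice made only for illustration.
EDGES = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]}


def is_walk(w):
    """A walk is a nonempty vertex tuple whose consecutive pairs are edges."""
    return len(w) >= 1 and all(frozenset(p) in EDGES for p in zip(w, w[1:]))


def compose(v, u):
    """Return v o u: traverse the walk u (from x to y), then the walk v (from y to z)."""
    if u[-1] != v[0]:
        raise ValueError("walks are not composable")
    return u + v[1:]


def identity(x):
    """The walk (x) of length zero, assumed here to be the identity morphism at x."""
    return (x,)
```

With `u = ("a", "b", "c")` and `v = ("c", "d")`, the call `compose(v, u)` concatenates the two walks into a walk from `"a"` to `"d"`; associativity of `compose` is inherited from associativity of tuple concatenation.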
13.2.3 Initial and Terminal Objects, and Universal Mapping Properties


We continue with a few more definitions. Let $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}, \circ)$ be a category and
let $A$ and $B$ be objects of $\mathcal{C}$. As usual, let $1_A$ and $1_B$ be the identity morphisms of
$\mathrm{Hom}(A, A)$ and $\mathrm{Hom}(B, B)$. A morphism $f \in \mathrm{Hom}(A, B)$ is called an isomorphism
if there exists a morphism $g \in \mathrm{Hom}(B, A)$ such that $g \circ f = 1_A$ and $f \circ g = 1_B$. In
this case, we write $f = g^{-1}$ and $g = f^{-1}$. An initial object in $\mathcal{C}$ is an object $I$ of $\mathcal{C}$
such that for all objects $A$, $\mathrm{Hom}(I, A)$ is a singleton set. That is to say, $I$ is an initial
object of $\mathcal{C}$ if and only if, for each object $A$ of $\mathcal{C}$, there exists a unique morphism
from $I$ to $A$. Likewise, a terminal object is an object $T$ such that for all objects $A$,
$\mathrm{Hom}(A, T)$ is a singleton set.

The following should be clear, but we’ll pause to give a proof anyway.


Lemma 13.2.1 (“Abstract Nonsense”) If the category $\mathcal{C}$ contains an initial object,
then it is unique up to isomorphism in $\mathcal{C}$. Similarly, if $\mathcal{C}$ contains a terminal object,
it is unique up to isomorphism in $\mathcal{C}$.


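Proof Let $I$ and $I'$ be initial objects of $\mathcal{C}$. Let $f \in \mathrm{Hom}(I, I')$, $f' \in \mathrm{Hom}(I', I)$ be
the unique morphisms. Then

$$f' \circ f \in \mathrm{Hom}(I, I) = \{1_I\}, \qquad f \circ f' \in \mathrm{Hom}(I', I') = \{1_{I'}\}.$$

Hence $f' \circ f = 1_I$ and $f \circ f' = 1_{I'}$, so $f$ is an isomorphism with inverse $f'$.
Thus, if $\mathcal{C}$ has an initial object, then it is unique up to isomorphism. Essentially the
same proof shows that if $\mathcal{C}$ has a terminal object, it is unique up to isomorphism.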


At first blush, it might seem that the notion of an initial object in a category
cannot be very interesting. Indeed, the category Group of groups and group
homomorphisms contains an initial object, viz., the trivial group $\{e\}$, which is arguably not
very interesting. At the same time the trivial group is also a terminal object in Group.
Similarly, $\{0\}$ is both an initial and a terminal object in Mod. As a less trivial
example, note that if $K$ is the kernel of the group homomorphism $\phi : G \to H$, then by our
discussions above, $(K, \mu)$, with $\mu : K \to G$ the inclusion, is a terminal object in the
category of example 8 above. Likewise, $(\bigoplus_{\alpha \in A} M_\alpha, \{\mu_\alpha \mid \alpha \in A\})$, with
$\mu_\alpha$ the canonical injections, is an initial object in the category of example
9 above. In light of these examples, we shall say that an object has a universal
mapping property if it is an initial (or terminal) object (in a suitable category).

In general, we cannot expect a given category to have either initial or terminal 
objects. When they do, we typically shall demonstrate this via a direct construction. 
This is certainly the case with the kernel and direct sum; once the constructions were 
given, one then proves that they satisfy the appropriate universal mapping properties, 
i.e., that they are terminal (resp. initial) objects in an appropriate category. 
13.2.4 Opposite Categories


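For each category $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}_{\mathcal{C}}, \circ)$, there is an opposite category
$\mathcal{C}^{\mathrm{opp}} = (\mathcal{O}, \mathrm{Hom}^{\mathrm{opp}}, \circ')$, having the same set of objects $\mathcal{O}$; but we now turn the arrows
that depict morphisms around. More precisely, $f \in \mathrm{Hom}(A, B)$ if and only if
$f^{\mathrm{opp}} \in \mathrm{Hom}^{\mathrm{opp}}(B, A)$. In category $\mathcal{C}$, if $f \in \mathrm{Hom}(A, B)$, $g \in \mathrm{Hom}(B, C)$, so that
$h := g \circ f \in \mathrm{Hom}(A, C)$, then in category $\mathcal{C}^{\mathrm{opp}}$ we have
$h^{\mathrm{opp}} = f^{\mathrm{opp}} \circ' g^{\mathrm{opp}} \in \mathrm{Hom}^{\mathrm{opp}}(C, A)$. One notices that an identity mapping $1_A$ in $\mathcal{C}$ becomes an identity
mapping in $\mathcal{C}^{\mathrm{opp}}$ and that an initial (terminal) object in $\mathcal{C}$ becomes a terminal (initial)
object in $\mathcal{C}^{\mathrm{opp}}$.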

A universal mapping property in one category corresponds to an “opposite” map- 
ping property in its opposite category. Let us illustrate this principle with the notion 
of the kernel. Earlier we discussed the universal mapping property of the kernel in the 
category of right R-modules. Let us do it again in a much more general categorical 
context. 

Suppose $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}, \circ)$ is a category in which a terminal object $T$ exists and
that $T$ is also the initial object, that is, $T = I$. Now for any two objects $A$ and
$B$ in $\mathcal{O}$, consider the class $\mathrm{Hom}(A, B)$ in category $\mathcal{C}$. This class always contains
a unique morphism which factors through $I = T$. Indeed, there is a unique mapping
$\tau_A : A \to T$, since $T$ is a terminal object, and a unique mapping $\iota_B : T = I \to B$,
since $T$ is also the initial object. Then $0_{AB} := \iota_B \circ \tau_A$ is the unique mapping in
$\mathrm{Hom}(A, B)$ which factors through $I = T$. This notation, $0_{AB}$, is exploited in the next
paragraphs.

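Now, suppose $\phi : A \to B$ is a fixed given morphism. Then the kernel of $\phi$ is a
morphism $\kappa : (\ker \phi) \to A$ such that $\phi \circ \kappa = 0_{(\ker \phi)B}$, with this universal property:

If $\gamma : X \to A$ is a morphism of $\mathcal{C}$ such that $\phi \circ \gamma = 0_{XB}$, then there is a unique morphism
$\theta_X : X \to \ker \phi$ such that $\kappa \circ \theta_X = \gamma$.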


Now in the opposite category this property becomes the defining property of the
cokernel. Again suppose $\mathcal{C} = (\mathcal{O}, \mathrm{Hom}, \circ)$ is a category in which a terminal object
$T$ exists and that $T$ is also the initial object $I$. Fix a morphism $\phi : A \to B$. Then the
cokernel (if it exists) is a morphism $\epsilon : B \to C := \operatorname{coker} \phi$ such that $\epsilon \circ \phi = 0_{AC}$,
with the following universal property:

If $\epsilon' : B \to X$ is a morphism of $\mathcal{C}$ such that $\epsilon' \circ \phi = 0_{AX}$, then there exists a unique
morphism $\theta_X : C \to X$ such that $\theta_X \circ \epsilon = \epsilon'$. That is, the following diagram
commutes:

$$
\begin{array}{ccccc}
A & \xrightarrow{\ \phi\ } & B & \xrightarrow{\ \epsilon\ } & C \\
 & & & {\scriptstyle \epsilon'} \searrow & \downarrow {\scriptstyle \theta_X} \\
 & & & & X
\end{array}
$$

Again, it is easy to describe the cokernel as an initial object in a suitable category
of diagrams that depend on the morphism $\phi$.


13.2.5 Functors 


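Let $\mathcal{C}_1 = (\mathcal{O}_1, \mathrm{Hom}_1, \circ_1)$ and $\mathcal{C}_2 = (\mathcal{O}_2, \mathrm{Hom}_2, \circ_2)$ be two categories. We may
write $\mathrm{Hom}_i = \bigcup \{\mathrm{Hom}_i(A, B) \mid A, B \in \mathcal{O}_i\}$, the full collection of morphisms of
category $\mathcal{C}_i$, $i = 1, 2$.

A covariant functor $F : \mathcal{C}_1 \to \mathcal{C}_2$ is a pair of mappings,

$$F_O : \mathcal{O}_1 \to \mathcal{O}_2, \qquad F_H : \mathrm{Hom}_1 \to \mathrm{Hom}_2,$$

such that
(*) If $f \in \mathrm{Hom}_1(A, B)$ and $g \in \mathrm{Hom}_1(B, C)$, then

1. $F_H(f) \in \mathrm{Hom}_2(F(A), F(B))$, $F_H(g) \in \mathrm{Hom}_2(F(B), F(C))$, and
2. $F_H(g \circ_1 f) = F_H(g) \circ_2 F_H(f) \in \mathrm{Hom}_2(F(A), F(C))$.

Here $F(A)$ abbreviates $F_O(A)$. In other words: (1) $F_H$ is compatible with $F_O$ in the sense that if $f : A \to B$ is
a morphism of $\mathcal{C}_1$, then $F_H(f)$ is a morphism in $\mathrm{Hom}_2(F(A), F(B))$, and (2) $F_H$
preserves composition of mappings.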

Here is an example: Let $\mathcal{C}$ be the category of example 4 of p. 597, whose objects
were the elements of a group $G$ and whose morphisms were the maps induced by left
multiplication by elements of $G$. Then any endomorphism $\sigma : G \to G$ is a covariant
functor of this category into itself, acting as $x \mapsto \sigma(x)$ on objects and as $g \mapsto \sigma(g)$
on morphisms. The reader can easily devise a similar functor
using a monoid endomorphism for example 5 of p. 597.
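The claim that a group endomorphism gives a covariant functor on the category of example 4 can be checked by brute force in a small case. The sketch below makes several assumptions for illustration: the group is $\mathbb{Z}/6\mathbb{Z}$ written additively (so the condition $gx = y$ reads $g + x = y$), the endomorphism is $x \mapsto 2x$, and the names `hom`, `sigma`, and `is_covariant_functor` are invented here.

```python
N = 6  # G = Z/6Z written additively; an illustrative choice of group


def hom(x, y):
    """Hom(x, y) = {g in G | g + x = y}; always a singleton."""
    return {g for g in range(N) if (g + x) % N == y}


def sigma(x):
    """A sample endomorphism of G: x -> 2x (mod 6)."""
    return (2 * x) % N


def is_covariant_functor():
    """Check conditions (1) and (2) of (*) with F_O = F_H = sigma."""
    for x in range(N):
        for y in range(N):
            (g,) = hom(x, y)
            # (1): sigma(g) must be a morphism from sigma(x) to sigma(y).
            if sigma(g) not in hom(sigma(x), sigma(y)):
                return False
            for z in range(N):
                (h,) = hom(y, z)
                # (2): composition (here, addition in G) is preserved.
                if sigma((h + g) % N) != (sigma(h) + sigma(g)) % N:
                    return False
    return True
```

Both conditions reduce to the homomorphism property of `sigma`, which is why any endomorphism works and not only this one.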

Here is a classic example we have met before (see p. 494). Let $M$ be a fixed
right $R$-module. Now for any right $R$-module $A$, the set $\mathrm{Hom}_R(M, A)$ is an abelian
group. If $f : A \to B$ is a morphism, composing all the elements of $\mathrm{Hom}_R(M, A)$
with $f$ yields a mapping $F(f) : \mathrm{Hom}_R(M, A) \to \mathrm{Hom}_R(M, B)$ preserving the additive
structure. Thus $\mathrm{Hom}_R(M, -) : \mathbf{Mod}_R \to \mathbf{Ab}$ is a covariant functor from the category
of right $R$-modules to the category of abelian groups.

Similarly, $\mathrm{Hom}_R(M, -)$ is a covariant functor $\mathbf{Mod}_R \to {}_R\mathbf{Mod}$, from the category
of right $R$-modules to the category of left $R$-modules (again see p. 494).

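Now suppose $F$ is a covariant functor from category $\mathcal{C}_1 = (\mathcal{O}_1, \mathrm{Hom}_1, \circ_1)$ to
$\mathcal{C}_2 = (\mathcal{O}_2, \mathrm{Hom}_2, \circ_2)$ such that $F_O : \mathcal{O}_1 \to \mathcal{O}_2$ and $F_H : \mathrm{Hom}_1 \to \mathrm{Hom}_2$ are
bijections. Let $F_O^{-1}$ and $F_H^{-1}$ be the corresponding inverse mappings. Then
$F^{-1} = (F_O^{-1}, F_H^{-1})$ preserves composition and so is a functor $F^{-1} : \mathcal{C}_2 \to \mathcal{C}_1$. In this case
the covariant functor $F$ is called an isomorphism of categories.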

There is another kind of functor, a contravariant functor, which reverses the
direction and order of compositions. Again it consists of two maps, $F_O : \mathcal{O}_1 \to \mathcal{O}_2$ and
$F_H : \mathrm{Hom}_1 \to \mathrm{Hom}_2$, except now we have:

(1*) If $f : A \to B$ is in $\mathrm{Hom}_1$, then $F_H(f) \in \mathrm{Hom}_2(F(B), F(A))$.
(2*) If $f : A \to B$ and $g : B \to C$ are in $\mathrm{Hom}_1$, then
$F_H(g \circ_1 f) = F_H(f) \circ_2 F_H(g) \in \mathrm{Hom}_2(F(C), F(A))$.

Now, fixing the right $R$-module $M$, the functor $\mathrm{Hom}_R(-, M) : \mathbf{Mod}_R \to \mathbf{Ab}$ is a
contravariant functor from the category of right $R$-modules to the category of abelian
groups.

There are further versions of functors of this sort, where $M$ is a bimodule. A
classical example is the functor which, in the category of vector spaces, takes a vector
space to its dual space. Specifically, let $F$ be a field and let Vect be the category
whose objects are the $(F, F)$-bimodules, that is, $F$-vector spaces where $\alpha v = v\alpha$
for any vector $v$ and scalar $\alpha$. The morphisms are the linear transformations between
vector spaces. The dual space functor $\mathrm{Hom}_F(-, F) : \mathbf{Vect} \to \mathbf{Vect}$ takes a linear
transformation $T : V \to W$ to the mapping $\mathrm{Hom}_F(W, F) \to \mathrm{Hom}_F(V, F)$ which
takes the functional $f \in \mathrm{Hom}_F(W, F)$ to the functional $f \circ T \in \mathrm{Hom}_F(V, F)$.
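The reversal of composition under the dual space functor can be seen concretely in coordinates. In the sketch below, which rests on assumptions made only for illustration, a linear map $T : F^n \to F^m$ is stored as an $m \times n$ matrix (a list of rows), a functional on $F^m$ is stored as a row vector of length $m$, and the helper names `matmul` and `apply_dual` are invented here. Integer entries stand in for field elements.

```python
def matmul(a, b):
    """Matrix of the composite (map with matrix a) o (map with matrix b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply_dual(matrix, functional):
    """Return the row vector of f o T, where T has the given matrix and f is a row vector."""
    rows, cols = len(matrix), len(matrix[0])
    return [sum(functional[i] * matrix[i][j] for i in range(rows))
            for j in range(cols)]


S = [[1, 2], [0, 1], [3, 0]]   # S : F^2 -> F^3
T = [[1, 0, 2], [0, 1, 1]]     # T : F^3 -> F^2
f = [5, -1]                    # a functional on the codomain of T

# Dual of the composite T o S, applied to f ...
lhs = apply_dual(matmul(T, S), f)
# ... agrees with applying the dual of T first and the dual of S second.
rhs = apply_dual(S, apply_dual(T, f))
```

Here `lhs` and `rhs` coincide, which is condition (2*) for this pair of maps: the dual of $T \circ S$ is the dual of $S$ composed after the dual of $T$.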

One final issue is that functors between categories can be composed: If $F : \mathcal{C}_1 \to \mathcal{C}_2$
and $G : \mathcal{C}_2 \to \mathcal{C}_3$ are functors, then, by composing the maps on objects and on the
morphism classes, $G_O \circ F_O$ and $G_H \circ F_H$, one obtains a functor $G \circ F : \mathcal{C}_1 \to \mathcal{C}_3$.
Obviously:

If $F : \mathcal{C}_1 \to \mathcal{C}_2$ and $G : \mathcal{C}_2 \to \mathcal{C}_3$ are functors, then the composite functor $G \circ F$ is
covariant if and only if $F$ and $G$ are either both covariant or both contravariant.


Of course you know where this is leading: to a “category of all categories”. Let’s
call it Cat. Its objects would be all categories (if this is even conceivable), and its
morphisms would be the covariant functors between categories. This is a dangerous
neighborhood, for as you see, Cat is also one of the objects of Cat. Clearly, we have
left the realm of sets far behind. The point of this meditation is that actually we
left sets behind long ago: the collections of objects in many of our favorite categories
(Ab, Ring, Mod, etc.) are not sets.² Of course this does not preclude them from being
collections that possess morphisms. These things are still amazingly useful since they
provide us with a language in which we can describe universal mapping properties.
We just have to be very careful not to inadvertently apply inappropriate axioms of
set theory to the collections of objects of these categories.

² For a lucid, superbly written account of the difference between sets and classes, the authors
recommend the early chapters of Set Theory and the Continuum Problem by Raymond Smullyan
and Melvin Fitting [3].


13.3 The Tensor Product as an Abelian Group


13.3.1 The Defining Mapping Property


Let $M$ be a right $R$-module, let $N$ be a left $R$-module, and let $A$ be an abelian group.
By a balanced mapping (or $R$-balanced mapping if we wish to specify $R$), we mean
a function

$$f : M \times N \to A$$

such that

(i) $f(m_1 + m_2, n) = f(m_1, n) + f(m_2, n)$,
(ii) $f(m, n_1 + n_2) = f(m, n_1) + f(m, n_2)$,
(iii) $f(mr, n) = f(m, rn)$,

where $m, m_1, m_2 \in M$, $n, n_1, n_2 \in N$, $r \in R$.
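Conditions (i) to (iii) can be tested exhaustively when the modules are small. The sketch below takes $R = \mathbb{Z}$, $M = \mathbb{Z}/4\mathbb{Z}$, $N = \mathbb{Z}/6\mathbb{Z}$, and $A = \mathbb{Z}/2\mathbb{Z}$; these choices, the candidate map $f(m, n) = mn \bmod 2$, and the helper name `is_balanced` are all assumptions made for illustration. Since $\mathbb{Z}$ is infinite, condition (iii) is only sampled over a finite range of scalars.

```python
from itertools import product

M, N, A = 4, 6, 2  # moduli of Z/4Z, Z/6Z and Z/2Z (illustrative choices)


def f(m, n):
    """Candidate balanced map Z/4Z x Z/6Z -> Z/2Z."""
    return (m * n) % A


def is_balanced(g, scalars=range(-3, 4)):
    """Brute-force check of (i), (ii), (iii); (iii) only for the sampled scalars."""
    for m1, m2, n1, n2 in product(range(M), range(M), range(N), range(N)):
        if g((m1 + m2) % M, n1) != (g(m1, n1) + g(m2, n1)) % A:   # (i)
            return False
        if g(m1, (n1 + n2) % N) != (g(m1, n1) + g(m1, n2)) % A:   # (ii)
            return False
    for m, n, r in product(range(M), range(N), scalars):
        if g((m * r) % M, n) != g(m, (r * n) % N):                # (iii)
            return False
    return True
```

The product map `f` passes the check, whereas a map such as $(m, n) \mapsto m + n \bmod 2$ already fails (i), so additivity in each variable separately is a genuine restriction.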

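Define a tensor product of $M$ and $N$ to mean an abelian group $T$, together with
a balanced map $t : M \times N \to T$ such that given any abelian group $A$, and any
balanced map $f : M \times N \to A$ there exists a unique abelian group homomorphism
$\phi : T \to A$ making the diagram below commute, that is, $\phi \circ t = f$.

$$
\begin{array}{ccc}
 & & T \\
 & {\scriptstyle t} \nearrow & \downarrow {\scriptstyle \phi} \\
M \times N & \xrightarrow{\ f\ } & A
\end{array}
$$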

Notice that a tensor product has been defined in terms of a universal mapping
property. Thus, by our discussions in the previous section, we should be able to
express a tensor product of $M$ and $N$ as an initial object of some category. Indeed,
define the category having as objects ordered pairs $(A, f)$ where $A$ is an abelian
group and $f : M \times N \to A$ is a balanced mapping. A morphism from $(A, f)$ to
$(A', f')$ is an abelian group homomorphism $\theta : A \to A'$ such that the following
diagram commutes, that is, $\theta \circ f = f'$:

$$
\begin{array}{ccc}
 & M \times N & \\
{\scriptstyle f} \swarrow & & \searrow {\scriptstyle f'} \\
A & \xrightarrow{\ \theta\ } & A'
\end{array}
$$

Then the preceding diagram defining the tensor product $T$ displays its role as an
initial object in this category. The following corollary is an immediate consequence
of “abstract nonsense” (see Lemma 13.2.1).


Corollary 13.3.1 The tensor product of the right R-module M and the left R-module 
N (if it exists) is unique up to abelian group isomorphism. 
13.3.2 Existence of the Tensor Product


Thus, there remains only the question of existence of the tensor product of the
respective right and left $R$-modules $M$ and $N$. To this end, let $F$ be the free $\mathbb{Z}$-module with
basis $M \times N$ (often called “the free abelian group on the set $M \times N$”). Let $B$ be the
subgroup of $F$ generated by the set of elements of the form

$$
\begin{aligned}
&(m_1 + m_2, n) - (m_1, n) - (m_2, n), \\
&(m, n_1 + n_2) - (m, n_1) - (m, n_2), \\
&(mr, n) - (m, rn),
\end{aligned}
$$

where $m, m_1, m_2 \in M$, $n, n_1, n_2 \in N$, $r \in R$. Write $M \otimes_R N := F/B$ and set
$m \otimes n := (m, n) + B \in M \otimes_R N$. Therefore, the defining relations in $M \otimes_R N$ are
precisely

$$
\begin{aligned}
(m_1 + m_2) \otimes n &= m_1 \otimes n + m_2 \otimes n, \\
m \otimes (n_1 + n_2) &= m \otimes n_1 + m \otimes n_2, \\
mr \otimes n &= m \otimes rn,
\end{aligned}
$$

for all $m, m_1, m_2 \in M$, $n, n_1, n_2 \in N$, $r \in R$.

The elements of the form $m \otimes n$ where $m \in M$ and $n \in N$ are called pure
tensors (or sometimes “simple tensors”). Furthermore, from its definition, $M \otimes_R N$
is generated by all of its pure tensors $m \otimes n$, $m \in M$, $n \in N$.
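A few routine consequences of the defining relations are worth writing out once, since arguments with pure tensors use them freely (for instance, whenever a pure tensor with a zero entry is declared to be zero). The following derivation is a sketch using only the relations displayed above and cancellation in the abelian group $M \otimes_R N$:

```latex
\begin{align*}
m \otimes 0 &= m \otimes (0 + 0) = m \otimes 0 + m \otimes 0
  &&\Longrightarrow\quad m \otimes 0 = 0, \\
0 \otimes n &= (0 + 0) \otimes n = 0 \otimes n + 0 \otimes n
  &&\Longrightarrow\quad 0 \otimes n = 0, \\
0 = 0 \otimes n &= (m + (-m)) \otimes n = m \otimes n + (-m) \otimes n
  &&\Longrightarrow\quad (-m) \otimes n = -(m \otimes n).
\end{align*}
```

In particular the negative of a pure tensor is again a pure tensor, so every element of $M \otimes_R N$ is a finite sum of pure tensors.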

Finally, define the mapping $t : M \times N \to M \otimes_R N$ by setting $t(m, n) = m \otimes n$,
$m \in M$, $n \in N$. Then, by construction, $t$ is a balanced map. In fact:


Theorem 13.3.2 The abelian group $M \otimes_R N$, together with the balanced map
$t : M \times N \to M \otimes_R N$, is a tensor product of $M$ and $N$.


Proof Let $A$ be an abelian group and let $f : M \times N \to A$ be a balanced mapping. If
the abelian group homomorphism $\theta : M \otimes_R N \to A$ is to exist, then it must satisfy
the condition that for all $m \in M$ and for all $n \in N$, $\theta(m \otimes n) = \theta \circ t(m, n) = f(m, n)$.
Since $M \otimes_R N$ is generated by the simple tensors $m \otimes n$, $m \in M$, $n \in N$, it follows
that $\theta : M \otimes_R N \to A$ is already uniquely determined. It remains, therefore, to show
that $\theta$ is an abelian group homomorphism. To do this, we need only verify that the
defining relations are satisfied; equivalently, that the homomorphism $F \to A$ extending $f$
vanishes on the generators of $B$ and so factors through $F/B$. We have, since $t$ and $f$ are balanced,

$$
\begin{aligned}
\theta((m_1 + m_2) \otimes n) &= f(m_1 + m_2, n) \\
&= f(m_1, n) + f(m_2, n) \\
&= \theta(m_1 \otimes n) + \theta(m_2 \otimes n),
\end{aligned}
$$

for all $m_1, m_2 \in M$, $n \in N$. The remaining relations are similarly verified, proving
that $\theta$ is, indeed, a group homomorphism $M \otimes_R N \to A$, satisfying $\theta \circ t = f$. We
have already noted that $\theta$ is uniquely determined by this condition, and so the proof
is complete. $\square$


A few very simple applications are in order here. The first is cast as a Lemma.

Lemma 13.3.3 If $M$ is a right $R$-module, then $M \otimes_R R \cong M$ as abelian groups.


To show this, one first easily verifies that the mapping $t : M \times R \to M$ defined by
setting $t(m, r) = mr$, $m \in M$, $r \in R$, is balanced. Next, let $A$ be an arbitrary abelian
group and let $f : M \times R \to A$ be a balanced mapping. Define $\theta : M \to A$ by
setting $\theta(m) := f(m, 1_R)$, $m \in M$. (Here $1_R$ is the multiplicative identity element
of the ring $R$.) Using the fact that $f$ is balanced, we have that
$\theta(m_1 + m_2) = f(m_1 + m_2, 1_R) = f(m_1, 1_R) + f(m_2, 1_R) = \theta(m_1) + \theta(m_2)$ for $m_1, m_2 \in M$, and
so $\theta$ is a homomorphism. Finally, let $m \in M$, $r \in R$; then
$\theta \circ t(m, r) = \theta(mr) = f(mr, 1_R) = f(m, r)$, and so $\theta \circ t = f$. Since $t$ is surjective,
$\theta$ is the only homomorphism with this property. This proves that $(M, t)$ satisfies the defining
universal mapping property for tensor products. By Corollary 13.3.1, the proof is
complete.


Remark Of course in the preceding proof one could define $e : M \to M \otimes_R R$ by
setting $e(m) := m \otimes 1_R$ and then observe that the mappings $t \circ e$ and $e \circ t$ (with $t$ now
regarded as the induced homomorphism $M \otimes_R R \to M$, $m \otimes r \mapsto mr$) are identity
mappings of $M$ and $M \otimes_R R$ respectively. But that proof would not illustrate the use
of the uniqueness of the tensor product. Later in this chapter we shall encounter
other proofs of an isomorphism relation which also exploit the uniqueness of an
object defined by a universal mapping property.


As a second application of our definitions, consider the following:

If $A$ is any torsion abelian group and if $D$ is any divisible abelian group, then
$D \otimes_{\mathbb{Z}} A = 0$. If $a \in A$, let $0 \neq n \in \mathbb{Z}$ be such that $na = 0$. Then for any $d \in D$ there
exists $d' \in D$ such that $d'n = d$. Therefore $d \otimes a = d'n \otimes a = d' \otimes na = d' \otimes 0 = 0$.
Therefore every simple tensor in $D \otimes_{\mathbb{Z}} A$ is zero; since $D \otimes_{\mathbb{Z}} A$ is generated by simple
tensors, we conclude that $D \otimes_{\mathbb{Z}} A = 0$.

Using the simple observation in the preceding paragraph, the reader should have
no difficulty in proving that if $m$ and $n$ are positive integers with greatest common
divisor $d$, then

$$\mathbb{Z}/m\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/d\mathbb{Z}$$

(see Exercise (2) in Sect. 13.13.2).
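The isomorphism above can be sanity-checked through the universal property rather than proved. Balanced maps $\mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/k\mathbb{Z}$ correspond to homomorphisms out of the tensor product, and there are exactly $\gcd(d, k)$ homomorphisms $\mathbb{Z}/d\mathbb{Z} \to \mathbb{Z}/k\mathbb{Z}$. The sketch below counts balanced maps by brute force. It assumes the standard fact that a biadditive map on cyclic groups is determined by its value $c$ at $(1, 1)$, so that only the candidates $f(a, b) = abc$ need testing; for $\mathbb{Z}$-modules condition (iii) follows from biadditivity. The function names are invented for illustration.

```python
from math import gcd


def biadditive(c, m, n, k):
    """Is f(a, b) = a*b*c mod k a well-defined biadditive map Z/m x Z/n -> Z/k?"""
    def f(a, b):
        return (a * b * c) % k
    left = all(f((a1 + a2) % m, b) == (f(a1, b) + f(a2, b)) % k
               for a1 in range(m) for a2 in range(m) for b in range(n))
    right = all(f(a, (b1 + b2) % n) == (f(a, b1) + f(a, b2)) % k
                for a in range(m) for b1 in range(n) for b2 in range(n))
    return left and right


def count_balanced_maps(m, n, k):
    """Number of balanced maps Z/m x Z/n -> Z/k, one candidate per value c = f(1, 1)."""
    return sum(1 for c in range(k) if biadditive(c, m, n, k))
```

For $m = 4$, $n = 6$, $k = 12$ the count agrees with $\gcd(\gcd(4, 6), 12)$, as the isomorphism predicts. Agreement of these counts for various $k$ is evidence consistent with the statement, not a substitute for the exercise.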


13.3.3 Mapping Properties of the Tensor Product


The following is an important mapping property of the tensor product. 


Theorem 13.3.4 Let $R$ be a ring, let $\phi : M \to M'$ be a right $R$-module homomorphism
and let $\psi : N \to N'$ be a left $R$-module homomorphism. Then there exists a
unique abelian group homomorphism $\phi \otimes \psi : M \otimes_R N \to M' \otimes_R N'$ such that for
all $m \in M$, $n \in N$, $(\phi \otimes \psi)(m \otimes n) = \phi(m) \otimes \psi(n)$.

Proof We start by defining the mapping $\phi \times \psi : M \times N \to M' \otimes_R N'$ by the rule

$$(\phi \times \psi)(m, n) := \phi(m) \otimes \psi(n) \quad \text{for all } m \in M,\ n \in N.$$

Using the fact that $\phi$, $\psi$ are both module homomorphisms, $\phi \times \psi$ is easily checked to
be balanced. By the universality of $M \otimes_R N$, there exists a unique mapping, which
we shall denote $\phi \otimes \psi$, from $M \otimes_R N$ to $M' \otimes_R N'$ such that $(\phi \otimes \psi) \circ t = \phi \times \psi$,
where, as usual, $t : M \times N \to M \otimes_R N$ is the balanced mapping $t(m, n) = m \otimes n$,
$m \in M$, $n \in N$. Therefore,
$(\phi \otimes \psi)(m \otimes n) = (\phi \otimes \psi) \circ t(m, n) = (\phi \times \psi)(m, n) = \phi(m) \otimes \psi(n)$,
and we are done.


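The following is immediate from Theorem 13.3.4:


Corollary 13.3.5 (Composition of tensored maps) Let $R$ be a ring, let
$M \xrightarrow{\phi} M' \xrightarrow{\phi'} M''$ be a sequence of right $R$-module homomorphisms, and let
$N \xrightarrow{\psi} N' \xrightarrow{\psi'} N''$ be a sequence of left $R$-module homomorphisms. Then

$$(\phi' \otimes \psi') \circ (\phi \otimes \psi) = (\phi' \circ \phi) \otimes (\psi' \circ \psi) : M \otimes_R N \to M'' \otimes_R N''.$$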


Corollary 13.3.6 (The Distributive Law of Tensor Products) Let $\sum_{\sigma \in I} A_\sigma$ be a
direct sum of right $R$-modules $\{A_\sigma \mid \sigma \in I\}$, and let $N$ be a fixed left $R$-module. Then
$(\sum_{\sigma \in I} A_\sigma) \otimes N$ is a direct sum of its submodules $\{A_\sigma \otimes N\}$.


Proof By Theorem 8.1.6 on identifying internal direct sums, it is sufficient to
demonstrate two things: (i) that $(\sum_{\sigma \in I} A_\sigma) \otimes N$ is spanned by its submodules $A_\sigma \otimes N$ and
(ii) that for any index $\tau \in I$,

$$(A_\tau \otimes N) \cap \sum_{\sigma \neq \tau} (A_\sigma \otimes N) = 0. \tag{13.1}$$

We begin with (i). Any element $a \in \sum_{\sigma \in I} A_\sigma$ has the form $a = \sum_{\tau \in S} a_\tau$, where $S$ is
a finite subset of $I$, and $a_\tau \in A_\tau$. From the elementary properties of “$\otimes$” presented
at the beginning of Sect. 13.3.2, for any $n \in N$, we have $a \otimes n = \sum_{\tau \in S} a_\tau \otimes n$, the
summands of which lie in submodules $A_\tau \otimes N$. Since $(\sum_{\sigma \in I} A_\sigma) \otimes N$ is spanned
by such elements $a \otimes n$, it is also spanned by its submodules $A_\sigma \otimes N$. Thus (i) holds.

Since the sum $\sum_{\sigma \in I} A_\sigma$ is direct, there exists a family of projection mappings
$\{\pi_\tau : \sum_{\sigma \in I} A_\sigma \to A_\tau\}$ with the property that the restriction of $\pi_\tau$ to the submodule
$A_\sigma$ is the identity mapping if $\sigma = \tau$, while it is the zero mapping $A_\sigma \to 0$ whenever
$\sigma \neq \tau$ (see Sect. 8.1.8).

Now, letting $1_N$ denote the identity mapping on the module $N$, Theorem 13.3.4
gives us mappings $\pi_\tau \otimes 1_N : (\sum_{\sigma \in I} A_\sigma) \otimes N \to A_\tau \otimes N$ whose value at a pure tensor
$a \otimes n$ is $\pi_\tau(a) \otimes n$, for $(a, n) \in (\sum_{\sigma \in I} A_\sigma) \times N$. Thus $\pi_\tau \otimes 1_N$ restricted to the submodule
$A_\tau \otimes N$ is the identity mapping $1_{A_\tau} \otimes 1_N$ on that submodule. In contrast, for $\sigma \neq \tau$,
the mapping $\pi_\tau \otimes 1_N$ restricted to $A_\sigma \otimes N$ has image $0 \otimes N = 0$.

Now consider an element $b \in (A_\tau \otimes N) \cap \sum_{\sigma \neq \tau} (A_\sigma \otimes N)$. Then $(\pi_\tau \otimes 1_N)(b) = b$,
as $b \in A_\tau \otimes N$, while on the other hand this value is $0$, since $b \in \sum_{\sigma \neq \tau} (A_\sigma \otimes N)$.
Thus $b = 0$ and so Eq. (13.1) holds. Thus (ii) has been established.


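Of course, a “mirror image” of this proof will produce the following:


Corollary 13.3.7 If $N$ is a right $R$-module and $\sum_{\sigma \in I} A_\sigma$ is a direct sum of a family
of left $R$-modules $\{A_\sigma \mid \sigma \in I\}$, then $N \otimes (\sum_{\sigma \in I} A_\sigma)$ is a direct sum of its submodules
$\{N \otimes A_\sigma \mid \sigma \in I\}$. Thus one can write

$$N \otimes \Bigl( \bigoplus_{\sigma \in I} A_\sigma \Bigr) = \bigoplus_{\sigma \in I} (N \otimes A_\sigma).$$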


From Corollary 13.3.5, we see that for any ring $R$ and for each left $R$-module $N$,
the assignment $F : \mathbf{Mod}_R \to \mathbf{Ab}$ given by

$$F(M) := M \otimes_R N, \qquad F(\phi) := \phi \otimes \mathrm{id}_N : M \otimes_R N \to M' \otimes_R N$$

(where $\phi : M \to M'$ is a right $R$-module homomorphism and as usual $\mathrm{id}_N$ is the
identity mapping on $N$) defines a functor.

Similarly, Theorem 13.3.4 and Corollary 13.3.5 tell us that for each right $R$-module
$M$ the assignment $G : {}_R\mathbf{Mod} \to \mathbf{Ab}$ given by

$$G(N) := M \otimes_R N, \qquad G(\phi) := \mathrm{id}_M \otimes \phi : M \otimes_R N \to M \otimes_R N'$$

(where $\phi : N \to N'$ is a left $R$-module homomorphism) also defines a functor.
These two functors are often denoted $- \otimes_R N$ and $M \otimes_R -$, respectively.

We shall continue our discussion of mapping properties of the tensor product,
with the existence of these functors in mind.

The next theorem shows how the tensor product behaves with respect to exact
sequences. (In the language of functors, this result describes the right exactness
of the two tensor functors just introduced.)


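Theorem 13.3.8 (i) Let $M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$ be an exact sequence of right
$R$-modules, and let $N$ be a left $R$-module. Then the sequence

$$M' \otimes_R N \xrightarrow{\mu \otimes \mathrm{id}_N} M \otimes_R N \xrightarrow{\epsilon \otimes \mathrm{id}_N} M'' \otimes_R N \to 0$$

is exact.
(ii) Let $N' \xrightarrow{\mu} N \xrightarrow{\epsilon} N'' \to 0$ be an exact sequence of left $R$-modules, and let $M$ be
a right $R$-module. Then

$$M \otimes_R N' \xrightarrow{\mathrm{id}_M \otimes \mu} M \otimes_R N \xrightarrow{\mathrm{id}_M \otimes \epsilon} M \otimes_R N'' \to 0$$

is exact.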
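Proof We shall be content to prove (i) as (ii) is entirely similar. First of all, if
$m'' \in M''$, then there exists an element $m \in M$ with $\epsilon(m) = m''$. Therefore, it
follows that for any $n \in N$, $(\epsilon \otimes \mathrm{id}_N)(m \otimes n) = \epsilon(m) \otimes \mathrm{id}_N(n) = m'' \otimes n$. Since
$M'' \otimes_R N$ is generated by the simple tensors of the form $m'' \otimes n$, we infer that
$\epsilon \otimes \mathrm{id}_N : M \otimes_R N \to M'' \otimes_R N$ is surjective.

Next, since $(\epsilon \otimes \mathrm{id}_N)(\mu \otimes \mathrm{id}_N) = \epsilon\mu \otimes \mathrm{id}_N = 0 : M' \otimes_R N \to M'' \otimes_R N$, we infer
that $\mathrm{im}\,(\mu \otimes \mathrm{id}_N) \subseteq \ker (\epsilon \otimes \mathrm{id}_N)$. Next, let $A$ be the subgroup of $M \otimes_R N$ generated
by simple tensors of the form $m \otimes n$, where $m \in \ker \epsilon$. It is clear that
$A \subseteq \ker (\epsilon \otimes \mathrm{id}_N)$. As a result, $\epsilon \otimes \mathrm{id}_N$ factors through a homomorphism

$$\overline{\epsilon \otimes \mathrm{id}_N} : (M \otimes_R N)/A \to M'' \otimes_R N$$

satisfying

$$\overline{\epsilon \otimes \mathrm{id}_N}(m \otimes n + A) = \epsilon(m) \otimes n \in M'' \otimes_R N, \quad m \in M,\ n \in N.$$

If we can show that $\overline{\epsilon \otimes \mathrm{id}_N}$ is an isomorphism, we will have succeeded in showing
that $A = \ker (\epsilon \otimes \mathrm{id}_N)$. To do this we show that $\overline{\epsilon \otimes \mathrm{id}_N}$ has an inverse. Indeed, define
the mapping

$$f : M'' \times N \to (M \otimes_R N)/A, \qquad f(m'', n) := m \otimes n + A,$$

where $m \in M$ is any element satisfying $\epsilon(m) = m''$. Note that if $m_1 \in M$ is any
other element satisfying $\epsilon(m_1) = m''$, then $m \otimes n - m_1 \otimes n = (m - m_1) \otimes n \in A$,
which proves that $f : M'' \times N \to (M \otimes_R N)/A$ is well defined. As it is clearly
balanced, we obtain an abelian group homomorphism $\theta : M'' \otimes_R N \to (M \otimes_R N)/A$
satisfying $\theta(m'' \otimes n) = (m \otimes n) + A$, where $\epsilon(m) = m''$. As it is clear that $\theta$ is
inverse to $\overline{\epsilon \otimes \mathrm{id}_N}$, we conclude that $A = \ker (\epsilon \otimes \mathrm{id}_N)$.

Finally, note that since $\ker \epsilon = \mathrm{im}\,\mu$ we have that $A \subseteq \mathrm{im}\,(\mu \otimes \mathrm{id}_N)$. Since
we have already shown that $\mathrm{im}\,(\mu \otimes \mathrm{id}_N) \subseteq \ker (\epsilon \otimes \mathrm{id}_N) = A$, it follows that
$\mathrm{im}\,(\mu \otimes \mathrm{id}_N) = \ker (\epsilon \otimes \mathrm{id}_N)$, and the proof is complete.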


We hasten to warn the reader that in Theorem 13.3.8 (i) above, even if $M' \xrightarrow{\mu} M$
is injective, it need not follow that $M' \otimes_R N \xrightarrow{\mu \otimes \mathrm{id}_N} M \otimes_R N$ is injective. (A similar
comment holds for part (ii).) Put succinctly, the tensor product does not take short
exact sequences to short exact sequences. In fact a large portion of “homological
algebra” is devoted to the study of functors that do not preserve exactness. As an
easy example, consider the short exact sequence of abelian groups (i.e. $\mathbb{Z}$-modules):

$$0 \to \mathbb{Z} \xrightarrow{\mu_2} \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to 0,$$

where $\mu_2(a) = 2a$. If we tensor the above short exact sequence on the right by $\mathbb{Z}/2\mathbb{Z}$,
we get the sequence

$$\mathbb{Z}/2\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z},$$

in which the first map, induced by $\mu_2$, is the zero map and so is not injective.
Thus the exactness breaks down.
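The failure of injectivity in this example can be made concrete. The sketch below assumes the identification $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}/k\mathbb{Z} \cong \mathbb{Z}/k\mathbb{Z}$, $a \otimes x \mapsto ax$ (the left-handed analogue of Lemma 13.3.3), under which the map induced by multiplication by $c$ on $\mathbb{Z}$ becomes $x \mapsto cx$ on $\mathbb{Z}/k\mathbb{Z}$. The function names are invented for illustration.

```python
def induced_map(multiplier, modulus):
    """The map on Z/modulus induced by a -> multiplier * a on Z, as a lookup table."""
    return {x: (multiplier * x) % modulus for x in range(modulus)}


def is_injective(mapping):
    """A finite map is injective exactly when no two inputs share a value."""
    return len(set(mapping.values())) == len(mapping)
```

Here `induced_map(2, 2)` sends both elements of $\mathbb{Z}/2\mathbb{Z}$ to $0$, so it is not injective, although multiplication by $2$ on $\mathbb{Z}$ is. By contrast `induced_map(2, 3)` is injective, so tensoring with $\mathbb{Z}/3\mathbb{Z}$ happens to preserve injectivity of this particular map.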
Motivated by Theorem 13.3.8 and the above example, we call a left $R$-module $N$
flat if for any short exact sequence of right $R$-modules of the form

$$0 \to M' \xrightarrow{\mu} M \xrightarrow{\epsilon} M'' \to 0$$

the following sequence is also exact:

$$0 \to M' \otimes_R N \xrightarrow{\mu \otimes \mathrm{id}_N} M \otimes_R N \xrightarrow{\epsilon \otimes \mathrm{id}_N} M'' \otimes_R N \to 0.$$

Note that by Theorem 13.3.8, one has that the left $R$-module $N$ is flat if and only if
for every injective right $R$-module homomorphism $\mu : M' \to M$, the induced
homomorphism $\mu \otimes \mathrm{id}_N : M' \otimes_R N \to M \otimes_R N$ is also injective.

The following lemma will prove useful in the discussion of flatness.


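Lemma 13.3.9 Let $R$ be a ring, and let $F$ be a free left $R$-module with basis $\{f_\beta \mid \beta \in \mathcal{B}\}$.
Let $M$ be a right $R$-module. Then every element of $M \otimes_R F$ can be uniquely
expressed as $\sum_{\beta \in \mathcal{B}} m_\beta \otimes f_\beta$, where only finitely many of the elements $m_\beta$, $\beta \in \mathcal{B}$, are
nonzero.

Proof First of all, it is obvious that each element of $M \otimes_R F$ admits such an
expression. Now assume that the element $\sum_{\beta \in \mathcal{B}} m_\beta \otimes f_\beta = 0$. We need to show that each of
the elements $m_\beta$, $\beta \in \mathcal{B}$, is $0$. Since $F$ is free with basis $\{f_\beta \mid \beta \in \mathcal{B}\}$, there exists,
for each $\beta \in \mathcal{B}$, a so-called projection homomorphism $\pi_\beta : F \to {}_R R$ such that

$$\pi_\beta(f_\gamma) = \begin{cases} 0 & \text{if } \gamma \neq \beta, \\ 1 & \text{if } \gamma = \beta. \end{cases}$$

Next, we recall that $M \otimes_R R \cong M$ via the isomorphism $\sigma : M \otimes_R R \to M$, $m \otimes r \mapsto mr \in M$
(see Lemma 13.3.3). Combining all of this, we have the following:

$$
\begin{aligned}
0 &= \sigma (1_M \otimes \pi_\beta) \Bigl( \sum_{\gamma \in \mathcal{B}} m_\gamma \otimes f_\gamma \Bigr) \\
  &= \sigma (m_\beta \otimes 1) \\
  &= m_\beta.
\end{aligned}
$$

The result follows.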
Remark Of course one could easily devise a proof of Lemma 13.3.9 based on right
distributive properties (see Corollary 13.3.6). However here we are interested in
retrieving a “flatness” result.


Theorem 13.3.10 Let $R$ be a ring. If $F$ is a free left $R$-module, then it is flat.


Proof Let $\{f_\beta \mid \beta \in \mathcal{B}\}$ be a basis of $F$ and let $\mu : M' \to M$ be an injective
homomorphism of right $R$-modules. We shall show that the induced homomorphism
$\mu \otimes 1_F : M' \otimes_R F \to M \otimes_R F$ is also injective. Using Lemma 13.3.9 we may
express a typical element of $M' \otimes_R F$ as $\sum_\beta m'_\beta \otimes f_\beta$, where only finitely many of
the elements $m'_\beta$, $\beta \in \mathcal{B}$, are nonzero. Thus, if $(\mu \otimes 1_F)(\sum_\beta m'_\beta \otimes f_\beta) = 0$, then
$\sum_\beta \mu(m'_\beta) \otimes f_\beta = 0 \in M \otimes_R F$. But then the uniqueness statement of Lemma 13.3.9
guarantees that each $\mu(m'_\beta) = 0$. Since $\mu : M' \to M$ is injective, we infer that each
$m'_\beta = 0$, and so the original element $\sum_\beta m'_\beta \otimes f_\beta = 0$, proving the result.


13.3.4 The Adjointness Relationship of the Hom and Tensor 
Functors 


In this section, we use some of the elementary language of category theory
introduced in Sect. 13.2. Let $R$ be a ring and let ${}_R\mathbf{Mod}$, $\mathbf{Ab}$ denote the categories of left
$R$-modules and abelian groups, respectively. Thus, if $M$ is a fixed right $R$-module,
then we have a functor

$$M \otimes_R - : {}_R\mathbf{Mod} \to \mathbf{Ab}.$$

In an entirely similar way, for any fixed left $R$-module $N$, there is a functor

$$- \otimes_R N : \mathbf{Mod}_R \to \mathbf{Ab},$$

where $\mathbf{Mod}_R$ is the category of right $R$-modules. Next we consider a functor
$\mathbf{Ab} \to {}_R\mathbf{Mod}$, alluded to in Sect. 8.4.2, Chap. 8. Indeed, if $M$ is a fixed right $R$-module, we
may define

$$\mathrm{Hom}_{\mathbb{Z}}(M, -) : \mathbf{Ab} \to {}_R\mathbf{Mod}.$$

Note that if $A$ is an abelian group, then $\mathrm{Hom}_{\mathbb{Z}}(M, A)$ is a left $R$-module via
$(r \cdot f)(m) = f(mr)$. For the fixed right $R$-module $M$, the functors $M \otimes_R -$ and
$\mathrm{Hom}_{\mathbb{Z}}(M, -)$ satisfy the following important adjointness relationship:


Theorem 13.3.11 (Adjointness Relationship) If $M$ is a right $R$-module, $N$ is a left
$R$-module, and if $A$ is an abelian group, there is a natural equivalence of sets:

$$\mathrm{Hom}_{\mathbb{Z}}(M \otimes_R N, A) \cong_{\mathbf{Set}} \mathrm{Hom}_R(N, \mathrm{Hom}_{\mathbb{Z}}(M, A)). \tag{13.2}$$
Proof One defines a mapping $\theta$ from the left side of (13.2) to the right side in the
following way. For each $f \in \mathrm{Hom}_{\mathbb{Z}}(M \otimes_R N, A)$ let

$$\theta(f) : N \to \mathrm{Hom}_{\mathbb{Z}}(M, A)$$

be defined by

$$\theta(f)(n) : M \to A, \quad \text{where } [\theta(f)(n)](m) = f(m \otimes n)$$

for all $n \in N$ and $m \in M$.

For the inverse consider an arbitrary left $R$-module homomorphism
$\psi : N \to \mathrm{Hom}_{\mathbb{Z}}(M, A)$. Observe that $\theta^*(\psi) : M \times N \to A$ defined by
$\theta^*(\psi)(m, n) = [\psi(n)](m)$ is a balanced mapping, and so factors through $M \otimes_R N$ to yield a mapping
$\theta^{-1}(\psi) : M \otimes_R N \to A$. The fact that $\theta^{-1}$ really is an inverse (that is, $\theta^{-1} \circ \theta$ and
$\theta \circ \theta^{-1}$ are respectively the identity mappings on the left and right sides of (13.2)) is
easily verified.
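At the level of underlying sets, the bijection in this proof has the same shape as currying a function of two arguments. The sketch below is an analogy under stated assumptions: a homomorphism out of $M \otimes_R N$ is represented by the balanced map it induces on pairs $(m, n)$, the names `theta` and `theta_inverse` are chosen to mirror the proof, and no module structure is checked.

```python
def theta(f):
    """Send f, given on pairs (m, n), to the map n -> (m -> f(m, n))."""
    return lambda n: (lambda m: f(m, n))


def theta_inverse(psi):
    """Send psi : n -> (m -> value) back to the map (m, n) -> psi(n)(m)."""
    return lambda m, n: psi(n)(m)


def sample_balanced(m, n):
    """A Z-balanced map Z x Z -> Z, used only as sample data."""
    return 3 * m * n
```

For example, `theta(sample_balanced)(4)` is the function $m \mapsto 12m$, and applying `theta_inverse` to `theta(sample_balanced)` returns a function that agrees with `sample_balanced` on every pair. The content of the theorem beyond this set-level bijection is that balancedness on one side matches $R$-linearity of $\psi$ on the other.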


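In general if $\mathcal{C}$, $\mathcal{D}$ are categories, and if $F : \mathcal{C} \to \mathcal{D}$, $G : \mathcal{D} \to \mathcal{C}$ are functors, we
say that $F$ is left adjoint to $G$ (and that $G$ is right adjoint to $F$) if there is a natural
equivalence of sets

$$\mathrm{Hom}_{\mathcal{D}}(F(X), Y) \cong_{\mathbf{Set}} \mathrm{Hom}_{\mathcal{C}}(X, G(Y)),$$

where $X$ is an object of $\mathcal{C}$ and $Y$ is an object of $\mathcal{D}$. Thus, we see that the functor
$M \otimes_R -$ is left adjoint to the functor $\mathrm{Hom}_{\mathbb{Z}}(M, -)$.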


13.4 The Tensor Product as a Right S-Module 


In the last section we started with a right $R$-module $M$ and a left $R$-module $N$ and
constructed the abelian group $M \otimes_R N$ satisfying the universal mapping properties
relative to balanced mappings. In this section, we shall discuss conditions that will
enable $M \otimes_R N$ to carry a module structure.

To this end let $S$, $R$ be rings, assume that $M$ is a right $R$-module, and assume that
$N$ is an $(R, S)$-bimodule. In order to give $M \otimes_R N$ the structure of a right $S$-module,
it suffices to construct a ring homomorphism

$$\phi : S \to \mathrm{End}_{\mathbb{Z}}(M \otimes_R N)^*,$$

where the “$*$” is there to indicate that the endomorphisms are to act on the right in
this abelian group. This will allow for the definition of an $S$-scalar multiplication:
$a \cdot s := a\phi(s)$, $a \in M \otimes_R N$. For each $s \in S$ define $f_s : M \times N \to M \otimes_R N$ by
setting $f_s(m, n) := m \otimes ns$, $m \in M$, $n \in N$. Then $f_s$ is easily checked to
be a balanced map. By the universal mapping property of the tensor product, there
exists a unique abelian group homomorphism $\phi_s : M \otimes_R N \to M \otimes_R N$ satisfying
$\phi_s(m \otimes n) = m \otimes ns$. Note that if $m \in M$ and $n \in N$, then

$$
\begin{aligned}
\phi_{s_1 + s_2}(m \otimes n) &= m \otimes n(s_1 + s_2) \\
&= m \otimes (ns_1 + ns_2) \\
&= m \otimes ns_1 + m \otimes ns_2 \\
&= \phi_{s_1}(m \otimes n) + \phi_{s_2}(m \otimes n) \\
&= (\phi_{s_1} + \phi_{s_2})(m \otimes n).
\end{aligned}
$$

It follows, therefore, that $\phi_{s_1 + s_2} = \phi_{s_1} + \phi_{s_2}$. Similarly, one verifies that
$\phi_{s_1 s_2} = (\phi_{s_1}) \cdot (\phi_{s_2})$, where, for composition of right operators, the “dot” indicates that $\phi_{s_1}$
is applied first and $\phi_{s_2}$ second. In turn, this immediately implies that the mapping
$\phi : S \to \mathrm{End}_{\mathbb{Z}}(M \otimes_R N)^*$, $\phi(s) := \phi_s$, is the desired ring homomorphism. In other
words, we have succeeded in giving $M \otimes_R N$ the structure of a right $S$-module.

The relevant universal property giving rise to a module homomorphism is the
following:


Theorem 13.4.1 Let $R$ and $S$ be rings, let $M$ be a right $R$-module, and let $N$ be an
$(R, S)$-bimodule. If $K$ is a right $S$-module and if $f : M \times N \to K$ is a balanced
mapping which also satisfies

$$f(m, ns) = f(m, n)s, \quad s \in S,\ m \in M,\ n \in N,$$

then the uniquely induced abelian group homomorphism $\theta : M \otimes_R N \to K$ is also
a right $S$-module homomorphism.


Proof Let $m \in M$, $n \in N$, and let $s \in S$. Then

$$
\begin{aligned}
\theta((m \otimes n)s) &= \theta(m \otimes (ns)) \\
&= \theta \circ t(m, ns) \\
&= f(m, ns) \\
&= f(m, n)s \\
&= (\theta \circ t(m, n))s \\
&= \theta(m \otimes n)s.
\end{aligned}
$$

Since $M \otimes_R N$ is generated by the simple tensors $m \otimes n$, $m \in M$, $n \in N$, we conclude
that $\theta$ preserves $S$-scalar multiplication, and hence is an $S$-module homomorphism
$M \otimes_R N \to K$.


Corollary 13.4.2 (Exchange of Rings) Let $R$ be a subring of $S$. If $M$ is a right
$R$-module, then $M \otimes_R S$ is a right $S$-module.


Proof Since $S$ is an $(R, S)$-bimodule, it can replace the module $N$ in Theorem 13.4.1.

A frequent example of such an exchange is the enlargement of the ground field of
a vector space: here an $F$-vector space $V$ becomes $V \otimes_F E$, where $E$ is an extension
field of $F$.³ In fact one has the following:


Corollary 13.4.3 Suppose $E$ is a field containing $F$ as a subfield. Let $1_E$ denote the
multiplicative identity element of $E$. If $V$ is an $F$-vector space, then $V \otimes_F E$ is a
right vector space over $E$. If $X = \{x_\sigma\}$ is a basis for $V$ then $X \otimes 1_E := \{x_\sigma \otimes 1_E\}$
is an $E$-basis for $V \otimes_F E$. Hence

\[ \dim_F V = \dim_E (V \otimes_F E). \]


The proof is left as Exercise (1) in Sect. 13.13.4. 

Of particular importance is the following special case. Assume that $R$ is a commutative
ring. In this case, any right $R$-module $M$ can also be regarded as a left
$R$-module by declaring that $rm := mr$, $m \in M$, $r \in R$. Furthermore, this specification
also gives $M$ the structure of an $(R, R)$-bimodule. On p. 234 we called such
a module $M$ a symmetric bimodule. For example, vector spaces over a field are
normally treated as symmetric bimodules.

Suppose $M$ and $N$ are such symmetric $(R, R)$-bimodules. Then for all $m \in M$, $n \in N$,
and $r \in R$, we can define left and right scalar multiplication of pure tensors by
the first and last entries in the following equations:

\[ (m \otimes n)r = m \otimes (nr) = m \otimes (rn) = (mr) \otimes n = (rm) \otimes n = r(m \otimes n). \]

Of course, this endows $M \otimes_R N$ with the structure of a symmetric $(R, R)$-bimodule.

Note that Theorem 13.4.1 says in particular that if $F$ is a field and if $V$ and $W$ are
vector spaces over $F$, then $V \otimes_F W$ is automatically a vector space over $F$. In fact,
we can say more:


Theorem 13.4.4 Let $F$ be a field, and let $V$ and $W$ be $F$-vector spaces with bases
$\{v_\sigma \mid \sigma \in I\}$ and $\{w_\tau \mid \tau \in J\}$, respectively. Then $V \otimes_F W$ has basis
$\{v_\sigma \otimes w_\tau \mid (\sigma, \tau) \in I \times J\}$. In particular,

\[ \dim_F (V \otimes_F W) = \dim_F V \cdot \dim_F W, \]

as a product of cardinal numbers.


Proof Since $F$-vector spaces are free $(F, F)$-bimodules, we may write them as direct
sums of their 1-dimensional subspaces. Thus

\[ V = \bigoplus_{\sigma \in I} v_\sigma F, \qquad W = \bigoplus_{\tau \in J} F w_\tau. \]


3 Of course for over a century mathematicians (perhaps in pursuit of eigenvalues of transformations) 
have been replacing F-linear combinations of a basis with E-linear combinations without feeling 
the need of a tensor product. 
Now by the “distributive laws” (Corollaries 13.3.6 and 13.3.7) we have

\[ V \otimes_F W = \bigoplus_{(\sigma, \tau) \in I \times J} v_\sigma F \otimes F w_\tau. \]

Since each summand is an $(F, F)$-bimodule, we may write $(v_\sigma \otimes w_\tau) F$ in place of
$v_\sigma F \otimes F w_\tau$, and the direct sum of these summands is also an $(F, F)$-bimodule. Thus
$V \otimes_F W$ is an $F$-vector space with basis $\{v_\sigma \otimes w_\tau \mid (\sigma, \tau) \in I \times J\}$. □
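
A small numerical illustration of Theorem 13.4.4 may help. The sketch below assumes the coordinate model in which $V = F^2$, $W = F^3$ and $v \otimes w$ is recorded by the list of all products of coordinates (the Kronecker product); the helper names `kron` and `unit` are ours and not part of the text. It checks that the six tensors $e_i \otimes f_j$ are exactly the standard basis of $F^6$, and that the balanced condition holds for one sample scalar.

```python
from fractions import Fraction

def kron(v, w):
    """Coordinates of v (x) w in the basis e_i (x) f_j, ordered lexicographically."""
    return [a * b for a in v for b in w]

def unit(n, i):
    """The i-th standard basis vector of F^n (here F is the rational field)."""
    return [Fraction(int(k == i)) for k in range(n)]

dim_v, dim_w = 2, 3

# Images of the pairs (e_i, f_j) of basis vectors.
basis = [kron(unit(dim_v, i), unit(dim_w, j))
         for i in range(dim_v) for j in range(dim_w)]

# They coincide with the standard basis of F^(2*3), so dim(V (x) W) = 2 * 3.
standard = [unit(dim_v * dim_w, k) for k in range(dim_v * dim_w)]
is_basis = basis == standard

# Balanced condition for one sample: (v a) (x) w == v (x) (a w).
v = [Fraction(1), Fraction(2)]
w = [Fraction(3), Fraction(0), Fraction(-1)]
a = Fraction(5, 7)
balanced = kron([a * x for x in v], w) == kron(v, [a * y for y in w])
```

This is only a model in chosen coordinates; the theorem itself is basis-free and also covers infinite bases.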


13.5 Multiple Tensor Products and R-Multilinear Maps 


Everything that we did in defining a tensor product can also be done in defining 
multiple tensor products. This important topic is the basis of what is called multilinear 
algebra. There are two reasons that it should be introduced at this stage: (1) it is 
needed in Sect. 13.7 to define the tensor product of algebras, and (2) it is needed 
again in a construction of the tensor algebra in Sect. 13.9, and in our discussions of 
the symmetric and exterior algebras. The main idea here is to avoid having to justify 
any kind of associative law of the tensor product. 

Throughout this section, $R$ is a commutative ring and all modules considered are
(not necessarily symmetric) $(R, R)$-bimodules.

Let $\{M_i \mid i = 1, \ldots, n\}$ be a sequence of $(R, R)$-bimodules, and let $A$ be any
abelian group (with its operation denoted by addition). A homomorphism of additive
groups

\[ \alpha : M_1 \oplus \cdots \oplus M_n \to A \]

is said to be balanced if, for each index $i = 1, \ldots, n$, for each $n$-tuple $(a_1, \ldots, a_n) \in
\bigoplus M_i$, and for any $r \in R$, one has

\[ \alpha((a_1, \ldots, a_i r, a_{i+1}, \ldots, a_n)) = \alpha((a_1, \ldots, a_i, r a_{i+1}, a_{i+2}, \ldots, a_n)). \tag{13.3} \]


This simply extends our previous definition of balanced mapping where the parameter
$n$ was 2.

Generalizing the recipe for the usual tensor product, we form the free $\mathbb{Z}$-module
$F$ with basis consisting of the set of all $n$-tuples in $M_1 \times \cdots \times M_n$,⁴ and let $B$ be the
subgroup of $F$ generated by all elements of the form:

\[
\begin{aligned}
&(a_1, \ldots, a_i + a_i', \ldots, a_n) - (a_1, \ldots, a_i, \ldots, a_n) - (a_1, \ldots, a_i', \ldots, a_n), \\
&(a_1, \ldots, a_i r, a_{i+1}, \ldots, a_n) - (a_1, \ldots, a_i, r a_{i+1}, \ldots, a_n),
\end{aligned}
\]

where $i = 1, 2, \ldots, n$, $a_i, a_i' \in M_i$ and $r \in R$.


4 Again, the reader is warned that addition in $F$ is not addition in $\bigoplus M_i$ when the elements being
added lie in the latter sum.




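Let $a_1 \otimes \cdots \otimes a_n := (a_1, \ldots, a_n) + B$, the image of $(a_1, \ldots, a_n)$ under the
canonical projection $\pi : F \to F/B$ (such an element is called a pure multitensor)
and we write

\[ F/B := M_1 \otimes M_2 \otimes \cdots \otimes M_n. \]

Then for each $i$,

\[
\begin{aligned}
a_1 \otimes \cdots \otimes (a_i + a_i') \otimes \cdots \otimes a_n &= (a_1 \otimes \cdots \otimes a_i \otimes \cdots \otimes a_n) + (a_1 \otimes \cdots \otimes a_i' \otimes \cdots \otimes a_n), \\
a_1 \otimes \cdots \otimes a_i r \otimes a_{i+1} \otimes \cdots \otimes a_n &= a_1 \otimes \cdots \otimes a_i \otimes r a_{i+1} \otimes \cdots \otimes a_n
\end{aligned}
\]

for all $a_i, a_i' \in M_i$ and $r \in R$. Thus the mapping

\[ \theta : M_1 \times \cdots \times M_n \to F/B \]

(a restriction of $\pi$) is balanced.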
Conversely, suppose $\beta$ is a balanced mapping

\[ M_1 \times \cdots \times M_n \to A \]

for some abelian group $A$. Using the universal property of free $\mathbb{Z}$-modules, $\beta$ determines
a unique group homomorphism $\hat{\beta} : F \to A$, and, since $\beta$ is balanced, $\ker \hat{\beta}$
contains $B$. Thus by the Composition Theorem of homomorphisms and two
applications of the Fundamental Theorems of homomorphisms we have the following
sequence of homomorphisms:

\[
\begin{aligned}
\pi &: F \to F/B = M_1 \otimes \cdots \otimes M_n, && (13.4) \\
\pi_\beta &: F/B \to (F/B)/(\ker \hat{\beta}/B), && (13.5) \\
\mathrm{iso}_1 &: (F/B)/(\ker \hat{\beta}/B) \to F/(\ker \hat{\beta}), && (13.6) \\
\mathrm{iso}_2 &: F/(\ker \hat{\beta}) \to \hat{\beta}(F), && (13.7) \\
\mathrm{inc} &: \hat{\beta}(F) \to A. && (13.8)
\end{aligned}
\]

Here, (13.4) is the afore-mentioned projection, (13.5) is a projection using $B \subseteq \ker \hat{\beta}$,
(13.6) is a classical isomorphism, (13.7) is another classical isomorphism, and (13.8)
is the inclusion mapping.

Then

\[ \hat{\beta} = \mathrm{inc} \circ \mathrm{iso}_2 \circ \mathrm{iso}_1 \circ \pi_\beta \circ \pi, \]

which is a detailed way of saying that $\hat{\beta}$ factors through the tensor product:

\[ F \to M_1 \otimes \cdots \otimes M_n \to A. \]

Recalling that $\beta$ is a restriction of $\hat{\beta}$ one concludes the following:

Theorem 13.5.1 Suppose $M_1, \ldots, M_n, A$ are $(R, R)$-bimodules. Then there is a
unique balanced map $t : M_1 \times \cdots \times M_n \to M_1 \otimes \cdots \otimes M_n$ such that for any
balanced mapping

\[ \beta : M_1 \times \cdots \times M_n \to A, \]

there exists a balanced mapping

\[ \theta(\beta) : M_1 \otimes \cdots \otimes M_n \to A \]

such that $\beta = \theta(\beta) \circ t$; that is, “$\beta$ factors through the tensor product”.


The theorem is only about $R$-balanced mappings, which are certain morphisms of additive
groups. If the components $M_i$ are symmetric $(R, R)$-bimodules, one naturally expects
that such a theorem can hold so that the mappings involved, $t$, $\beta$ and $\theta(\beta)$, are
all morphisms of symmetric $(R, R)$-bimodules.


Theorem 13.5.2 Suppose $M_1, \ldots, M_n, A$ are $(R, R)$-bimodules where $R$ is a commutative
ring and $r m_i = m_i r$ and $ra = ar$ for all $(r, m_i, a) \in R \times M_i \times A$, $i \in
\{1, 2, \ldots, n\}$. Then there is a unique balanced map $t : M_1 \times \cdots \times M_n \to
M_1 \otimes \cdots \otimes M_n$ such that for any balanced mapping

\[ \beta : M_1 \times \cdots \times M_n \to A, \]

there exists an $(R, R)$-bimodule homomorphism

\[ \theta(\beta) : M_1 \otimes \cdots \otimes M_n \to A \]

such that $\beta = \theta(\beta) \circ t$; that is, “$\beta$ factors through the tensor product”.


Remark It is important for the student to realize that in this development, the multiple
tensor product is not approached by a series of iterations of the binary tensor product.
Rather, it is directly defined as a factor group $F/B$, a definition completely free of
parentheses.
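
To see the parenthesis-free point of view concretely, here is a minimal sketch in coordinates. It assumes free modules of finite rank over the integers, and the function name `multi_tensor` is our own. The pure multitensor $a_1 \otimes \cdots \otimes a_n$ is stored as the flat list of all products of one coordinate from each factor, so no bracketing is ever chosen; the two iterated binary products happen to give the same list.

```python
from itertools import product
from math import prod

def multi_tensor(*vectors):
    """Flat coordinates of a_1 (x) ... (x) a_n: one product per choice of entries."""
    return [prod(entries) for entries in product(*vectors)]

def kron(v, w):
    """Binary tensor product in coordinates."""
    return [x * y for x in v for y in w]

a, b, c = [1, 2], [3, 4, 5], [6, 7]

flat = multi_tensor(a, b, c)       # defined in one step, no parentheses
left = kron(kron(a, b), c)         # (a (x) b) (x) c
right = kron(a, kron(b, c))        # a (x) (b (x) c)

# Balanced in the first two slots: a r (x) b (x) c == a (x) r b (x) c.
r = 9
balanced = (multi_tensor([x * r for x in a], b, c)
            == multi_tensor(a, [r * y for y in b], c))

# Additive in the middle slot.
b2 = [1, 0, -2]
additive = (multi_tensor(a, [y + z for y, z in zip(b, b2)], c)
            == [p + q for p, q in zip(multi_tensor(a, b, c), multi_tensor(a, b2, c))])
```

The agreement of `flat`, `left` and `right` is a feature of this coordinate model; in the text the multiple tensor product is defined directly so that no such comparison is needed.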


13.6 Interlude: Algebras 


Let $A$ be a right $R$-module, where, for the moment, $R$ is an arbitrary ring (with
identity $1_R$). We say that $A$ is an $R$-algebra if and only if $A$ is a ring and
the $R$-scalar multiplication satisfies the condition

\[ (ab)r = a(br) = (ar)b, \quad \text{for all } a, b \in A,\ r \in R. \tag{13.9} \]

Note that as a consequence of the above requirement, we have, for all $a, b \in A$, $r, s \in
R$, that

\[ (ab)(rs) = (a(br))s = (as)(br) = ((as)r)b = (a(sr))b. \]

Moreover if we set $b = 1_A$, the multiplicative identity of $A$, in the above equation,
then we obtain the fact that $a(rs) = a(sr)$ for all $a \in A$ and all $r, s \in R$. Therefore the
annihilator $I := \mathrm{Ann}_R(A)$ contains all commutators $rs - sr \in R$, from which we
conclude that $A$ is an $R/I$-module. Noting that $R/I$ is a commutative ring, we see
that in our definition of $R$-algebra, we may as well assume (and we shall) that $R$ is
commutative from the outset. Note that with this convention in force, the $R$-algebra
$A$ is a symmetric $(R, R)$-bimodule via the definition $ra := ar$, $a \in A$, $r \in R$.
We consider some familiar examples:


Example 57 Let $F$ be a field and let $M_n(F)$ be the ring of $n \times n$ matrices over $F$.
Then the usual $F$-scalar multiplication with matrices is easily checked to provide
$M_n(F)$ the structure of an $F$-algebra.


Example 58 Let $F$ be a field, let $V$ be an $F$-vector space and let $\mathrm{End}_F(V)$ be the
set of linear transformations $V \to V$. This is already an $F$-vector space by the
definitions

\[ (T + S)(v) := T(v) + S(v), \qquad (T\alpha)(v) := T(v\alpha), \]

$T, S \in \mathrm{End}_F(V)$, $v \in V$, $\alpha \in F$. Multiplication of linear transformations is just
composition: $(TS)(v) := T(S(v))$, $T, S \in \mathrm{End}_F(V)$, $v \in V$, which is again a linear
transformation. This gives $\mathrm{End}_F(V)$ a ring structure. Finally, one checks that for
all $T, S \in \mathrm{End}_F(V)$ and for all $\alpha \in F$, $(TS)\alpha = T(S\alpha) = (T\alpha)S$ by showing
that these three linear transformations have the same effect at all $v \in V$. Therefore,
$\mathrm{End}_F(V)$ is an $F$-algebra.
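
The identity $(TS)\alpha = T(S\alpha) = (T\alpha)S$ of Eq. (13.9) can be spot-checked in the matrix model of Example 57. The following is a sketch only, using $2 \times 2$ integer matrices chosen by us; the helper names `matmul` and `scale` are hypothetical, and a numerical check on one sample is of course not a proof.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def scale(a, r):
    """Right scalar multiplication a * r, entry by entry."""
    return [[x * r for x in row] for row in a]

T = [[1, 2], [3, 4]]
S = [[0, 1], [5, -2]]
alpha = 7

lhs = scale(matmul(T, S), alpha)      # (TS) alpha
mid = matmul(T, scale(S, alpha))      # T (S alpha)
rhs = matmul(scale(T, alpha), S)      # (T alpha) S
```

All three matrices agree because the scalar $\alpha$ commutes with every entry, which is exactly why the base ring is taken to be commutative.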


Example 59 Let $R$ be a commutative ring, let $X$ be a set and let $R^X$ be the set of
functions $X \to R$ with ring structure given by point-wise operations:

\[ (f + g)(x) := f(x) + g(x), \qquad (fg)(x) := f(x)g(x), \quad \text{for all } f, g \in R^X,\ x \in X. \]

An $R$-scalar multiplication can again be given point-wise:

\[ (f \cdot \alpha)(x) := f(x)\alpha, \quad f \in R^X,\ x \in X,\ \alpha \in R. \]

The reader should have no difficulty in verifying that the above definition endows
$R^X$ with the structure of an $R$-algebra.


Example 60 The Dirichlet Algebra was defined as a certain collection of 
complex-valued functions on the set of positive integers. Multiplication was a sort 
of convolution (see Example 43 on p. 216 for details). Of course one can replace the 
complex numbers by any commutative ring R in the definition of “Dirichlet multi- 
plication” on the R-valued functions on the positive integers, to obtain an R-algebra. 

Example 61 Let $F$ be a field, and $M$ be a monoid. Then the $F$-monoid ring $FM$ admits
an $F$-scalar multiplication via pointwise scalar multiplication: if $f \in FM$, $\alpha \in F$,
and $m \in M$, set $(f\alpha)(m) := f(m)\alpha$. Perhaps this is made more transparent by
writing elements of $FM$ in the form $f := \sum_{m \in M} \alpha_m m$, with the usual convention that
$\alpha_m \neq 0$ for only finitely many $m \in M$, and defining

\[ f\alpha := \sum_{m \in M} (\alpha_m \alpha) m. \]

This gives $FM$ the structure of an $F$-algebra. We hasten to emphasize two very important
special cases of this construction: the polynomial rings and group rings (over a
finite group). Thus, we can (and often shall) refer to such objects as polynomial
algebras and group algebras. Note that the $F$-group algebra $FG$ over the finite
group $G$ is finite dimensional over $F$.
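
The monoid algebra of Example 61 is easy to model on a computer. The sketch below is an illustration under our own conventions: an element $\sum \alpha_m m$ is a dictionary from monoid elements to nonzero coefficients, and the monoid operation is passed in as a function `op`. Taking $M = (\mathbb{N}, +)$, with the exponent $m$ standing for $x^m$, recovers the polynomial algebra $F[x]$.

```python
from collections import defaultdict
from fractions import Fraction

def multiply(f, g, op):
    """Product in the monoid algebra FM: convolution along the monoid operation."""
    result = defaultdict(Fraction)
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            result[op(m1, m2)] += c1 * c2
    return {m: c for m, c in result.items() if c != 0}

def scalar(f, alpha):
    """Pointwise scalar multiplication (f alpha)(m) := f(m) alpha."""
    return {m: c * alpha for m, c in f.items() if c * alpha != 0}

def add_exponents(m, n):
    """The monoid (N, +); the key m stands for the monomial x^m."""
    return m + n

f = {0: Fraction(1), 1: Fraction(1)}     # 1 + x
g = {0: Fraction(1), 1: Fraction(-1)}    # 1 - x
fg = multiply(f, g, add_exponents)       # 1 - x^2

alpha = Fraction(3, 2)
lhs = scalar(multiply(f, g, add_exponents), alpha)    # (fg) alpha
mid = multiply(f, scalar(g, alpha), add_exponents)    # f (g alpha)
rhs = multiply(scalar(f, alpha), g, add_exponents)    # (f alpha) g
```

Replacing `add_exponents` by the multiplication of a finite group gives the group algebra in the same way.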


Note that if $R$ is a commutative ring and $A$ is an $R$-algebra, then any ideal $I \subseteq A$
of the ring $A$ must also be an $R$-submodule of $A$. Indeed, we realize that if $1_A$ is
the identity of $A$, and if $\alpha \in R$, then we have, for all $x \in I$, that $x\alpha = (x 1_A)\alpha =
x(1_A \alpha) \in I$, as $I$ is an ideal. From this it follows immediately that quotients of $A$
by right ideals are also $R$-modules. Furthermore, a trivial verification reveals that
quotients of $A$ by two-sided ideals of $A$ are also $R$-algebras.

We shall conclude this short section with a tensor product formulation of $R$-algebras.
Thus, we continue to assume that $R$ is a commutative ring and that $A$ is
an $R$-module. First of all, notice that if $A$ is an $R$-algebra, then by Eq. (13.9) the
multiplication $A \times A \to A$ in $A$ is $R$-balanced, and hence factors through the tensor
product, giving an $R$-module homomorphism $\mu : A \otimes_R A \to A$. The associativity of
the multiplication in $A$ translates into the commutativity of the diagram of Fig. 13.1.

At the same time, the identity element $1_A \in A$ determines an $R$-linear homomorphism
$\eta : R \to A$, defined by $\eta(\alpha) := 1_A \cdot \alpha \in A$. The fact that $1_A$ is the identity of
$A$ translates into the commutativity of the diagrams of Fig. 13.2, where
$\epsilon : R \otimes A \to A$ and $\epsilon' : A \otimes R \to A$ are the isomorphisms given by

\[ \epsilon(\alpha \otimes a) = \alpha a, \qquad \epsilon'(a \otimes \alpha) = a\alpha, \quad a \in A,\ \alpha \in R. \]


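Fig. 13.1 Diagram for associativity: the two composites $A \otimes_R A \otimes_R A \to A \otimes_R A \to A$,
namely $\mu \circ (\mu \otimes \mathrm{id}_A)$ and $\mu \circ (\mathrm{id}_A \otimes \mu)$, coincide.

Fig. 13.2 Diagrams for a two-sided identity: $\mu \circ (\eta \otimes \mathrm{id}_A) = \epsilon : R \otimes A \to A$ and
$\mu \circ (\mathrm{id}_A \otimes \eta) = \epsilon' : A \otimes R \to A$.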

Note that the distributive laws are already contained in the above. Indeed, if
$a, b, c \in A$, then using the fact that $\mu : A \otimes_R A \to A$ is an abelian group homomorphism,
we have

\[ a(b + c) = \mu(a \otimes (b + c)) = \mu(a \otimes b + a \otimes c) = \mu(a \otimes b) + \mu(a \otimes c) = ab + ac. \]

Conversely, if $A$ is a module over the commutative ring $R$, and if a multiplication
$\mu : A \otimes_R A \to A$ is given which makes the diagrams of Figs. 13.1 and 13.2 commute,
then $\mu$ gives $A$ the structure of an $R$-algebra (Exercise (1) in Sect. 13.13.5).


13.7 The Tensor Product of R-Algebras 


Throughout this section, we continue to assume that $R$ is a commutative ring. If
$A$, $B$ are both $R$-algebras, we shall give a natural $R$-algebra structure on the tensor
product $A \otimes_R B$. Recall from Sect. 13.4 that $A \otimes_R B$ is already an $R$-module with
scalar multiplication satisfying

\[ (a \otimes b)\alpha = a \otimes b\alpha = a\alpha \otimes b, \quad a \in A,\ b \in B,\ \alpha \in R. \]

To obtain an $R$-algebra structure on $A \otimes_R B$, we map

\[ f : A \times B \times A \times B \to A \otimes_R B, \]

by setting $f(a_1, b_1, a_2, b_2) = a_1 a_2 \otimes b_1 b_2$, $a_1, a_2 \in A$, $b_1, b_2 \in B$. Then, as $f$ is
clearly multilinear, Theorem 13.5.2 gives a mapping

\[ \mu : (A \otimes_R B) \otimes_R (A \otimes_R B) \to A \otimes_R B \]

such that $f = \mu \circ t$, where $t$ is the standard mapping

\[ t : A \times B \times A \times B \to A \otimes B \otimes A \otimes B. \]

Thus, we define the multiplication on $A \otimes_R B$ by setting $xy := \mu(x \otimes y)$, $x, y \in
A \otimes_R B$. Note that on simple tensors, this multiplication satisfies $(a \otimes b)(a' \otimes b') =
aa' \otimes bb'$, $a, a' \in A$, $b, b' \in B$. One now has the desired result:


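Theorem 13.7.1 Let $A$, $B$ be $R$-algebras over the commutative ring $R$. Then there
is an $R$-algebra structure on $A \otimes_R B$ such that $(a \otimes b) \cdot (a' \otimes b') = aa' \otimes bb'$, $a, a' \in
A$, $b, b' \in B$.


Proof It suffices to prove that the diagrams of Figs. 13.1 and 13.2 are commutative. In turn, to
prove that the given diagrams are commutative, it suffices to verify the commutativity
when applied to simple tensors. Thus, let $a, a', a'' \in A$, $b, b', b'' \in B$. We have

\[
\begin{aligned}
\mu(\mu \otimes \mathrm{id}_{A \otimes_R B})(a \otimes b \otimes a' \otimes b' \otimes a'' \otimes b'') &= \mu(aa' \otimes bb' \otimes a'' \otimes b'') \\
&= (aa')a'' \otimes (bb')b'' \\
&= a(a'a'') \otimes b(b'b'') \\
&= \mu(a \otimes b \otimes a'a'' \otimes b'b'') \\
&= \mu(\mathrm{id}_{A \otimes_R B} \otimes \mu)(a \otimes b \otimes a' \otimes b' \otimes a'' \otimes b''),
\end{aligned}
\]

proving that the diagram of Fig. 13.1 commutes. Proving that the diagrams of Fig. 13.2 also commute is even
easier, so the result follows.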
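
For matrix algebras the rule $(a \otimes b)(a' \otimes b') = aa' \otimes bb'$ can be observed directly. The sketch below assumes the standard coordinate identification of $a \otimes b$ with the Kronecker product of the matrices $a$ and $b$; this model and the helper names are our choice, not the text's. Under that assumption the multiplication of Theorem 13.7.1 becomes ordinary matrix multiplication of the Kronecker products.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker product of matrices, a coordinate model of the simple tensor a (x) b."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

A1 = [[1, 2], [0, 1]]
B1 = [[2, 0], [1, 1]]
A2 = [[0, 1], [1, 0]]
B2 = [[1, 3], [0, 2]]

# (A1 (x) B1)(A2 (x) B2), computed inside the 4 x 4 matrices
product_of_tensors = matmul(kron(A1, B1), kron(A2, B2))

# A1 A2 (x) B1 B2, computed factor by factor
tensor_of_products = kron(matmul(A1, A2), matmul(B1, B2))
```

The two results agree, which is the matrix form of the multiplication rule on simple tensors.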


13.8 Graded Algebras 


Let $(M, \cdot)$ be a (not necessarily commutative) monoid, let $R$ be a commutative ring,
and let $A$ be an $R$-algebra, as described in the previous sections. The algebra $A$ is
said to possess an $M$-grading if and only if

(i) $A = \bigoplus_{\sigma \in M} A_\sigma$, a direct sum of $R$-modules $A_\sigma$ indexed by $M$.
(ii) For any $\sigma, \tau \in M$,

\[ A_\sigma A_\tau = \{a_\sigma a_\tau \mid (a_\sigma, a_\tau) \in A_\sigma \times A_\tau\} \subseteq A_{\sigma \cdot \tau}. \]

Any $R$-algebra $A$ that possesses a grading with respect to some monoid $M$ is
called a graded algebra.

The elements of $A_\sigma$ are said to be homogeneous of degree $\sigma$ with respect to
the $M$-grading, and the submodules $A_\sigma$ themselves will be called homogeneous
summands.⁵ Note that by this definition, the additive identity element $0 \in A$ is
homogeneous of every possible degree chosen from $M$.


Example 62 We have already met such algebras in Sect. 7.3.2. The monoid algebras
$FM$, where $F$ is a field, for example, have homogeneous summands that are
1-dimensional over the field $F$. (The converse is not true. There are $M$-graded
$F$-algebras with 1-dimensional summands, which are not monoid rings.)

We also should include in this generic example the polynomial rings $D[x]$ and
$D[X]$ where $D$ is an integral domain and the multiplicative monoids
are, respectively, the powers of $x$, or all monomials $x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$, $a_i, n \in \mathbb{N}$, in
commuting indeterminates $x_i \in X$.⁶ Note that we must use an integral domain in
order to maintain property (ii), that $A_\sigma A_\tau \subseteq A_{\sigma \cdot \tau}$.


5 Although “homogeneous summand” is not a universally used term, it is far better than “homogeneous
component” which has already been assigned a specific meaning in the context of completely
reducible modules.

But this genus of examples also includes examples where the monoid (M, -) is 
not commutative: 


1. The polynomial ring D{X} in non-commuting indeterminates, where the relevant 
grading is provided by the free monoid on X (consisting of two or more symbols) 
whose elements are words in the alphabet X. 

2. The group ring DG where G is a non-commutative group. 


In these examples, which were discussed in Sects. 7.3.2 and 7.3.5, the homoge- 
neous summands are of the form Dm (one-dimensional if D is a field). 

The following observation provides examples of graded D-algebras whose homo- 
geneous summands are not of this form. 


Lemma 13.8.1 Let $A$ be a $D$-algebra graded with respect to a monoid $(M, \cdot)$.
Suppose $\phi : (M, \cdot) \to (M', \cdot)$ is a surjective homomorphism of monoids. For each
element $m'$ of the image monoid $M'$, define

\[ A_{m'} := \bigoplus_{\phi(\sigma) = m'} A_\sigma. \]

Then the direct decomposition $A = \bigoplus_{m' \in M'} A_{m'}$ defines a grading of $A$ with respect
to the image monoid $M'$.


Proof We need only show that if $m_1'$ and $m_2'$ are elements of $M'$, then $A_{m_1'} A_{m_2'} \subseteq
A_{m_1' \cdot m_2'}$. But if $\phi(m_i) = m_i'$, $i = 1, 2$, then $\phi(m_1 \cdot m_2) = m_1' \cdot m_2'$, since $\phi$ is a monoid
morphism. Thus $A_{m_1} A_{m_2} \subseteq A_{m_1' \cdot m_2'}$ for all $m_i$ such that $\phi(m_i) = m_i'$, $i = 1, 2$. Since
$A_{m_i'} = \bigoplus_{\phi(m_i) = m_i'} A_{m_i}$, the result follows. □


In effect, the monoid homomorphism $\phi$ produces a grading that is more “coarse”
than the original $M$-grading.


Example 63 The (commutative) polynomial ring $R[X]$, where $R$ is any commutative
ring, is graded by the multiplicative monoid $M^*(X)$ of finite monomials chosen from
$X$. As noted in Sect. 7.3.5, p. 205, there is a monoid morphism

\[ \deg : M^*(X) \to (\mathbb{N}, +) \]

which is defined by

\[ \deg \prod_{i=1}^{n} x_i^{a_i} = \sum_{i=1}^{n} a_i \]

for any finite subset $\{x_1, \ldots, x_n\}$ of $X$. The image $\deg m$ is called the degree of $m$,
for monomial $m \in M^*(X)$.


6 Recall that these monoids are respectively isomorphic to $(\mathbb{N}, +)$ and the additive monoid of all
finite multisets chosen from $X$. (Once again, we remind the beginning reader that the set of natural
numbers, $\mathbb{N}$, includes 0.)
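
The coarsening of Lemma 13.8.1 along $\deg$ can be made concrete. In the sketch below (our own encoding, not the text's) a polynomial in two commuting indeterminates $x, y$ is a dictionary from exponent pairs $(a, b)$, standing for $x^a y^b$, to coefficients; the function `coarsen` regroups its homogeneous pieces along any monoid morphism `phi`.

```python
from collections import defaultdict

def coarsen(element, phi):
    """Regroup the homogeneous pieces of `element` along the monoid morphism phi."""
    pieces = defaultdict(dict)
    for monomial, coeff in element.items():
        pieces[phi(monomial)][monomial] = coeff
    return dict(pieces)

def deg(exponents):
    """Total degree of the monomial x^a y^b encoded as the pair (a, b)."""
    return sum(exponents)

# 3x^2 - xy + 5y^2 + 4x
p = {(2, 0): 3, (1, 1): -1, (0, 2): 5, (1, 0): 4}
by_degree = coarsen(p, deg)

# deg is a monoid morphism: multiplying monomials adds exponent pairs.
m1, m2 = (2, 1), (0, 3)
product_monomial = (m1[0] + m2[0], m1[1] + m2[1])
is_morphism_here = deg(product_monomial) == deg(m1) + deg(m2)
```

In the fine grading each of the four monomials of `p` is its own homogeneous piece; after coarsening, the three quadratic monomials are collected into the single summand of degree 2.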


Assume that $A$, $A'$ are both $M$-graded algebras over the commutative ring $R$. An
algebra homomorphism $\phi : A \to A'$ is said to be a graded algebra homomorphism
if for each $m \in M$, $\phi(A_m) \subseteq A'_m$. When $A$ is an $M$-graded algebra over $R$, we shall
be interested in ideals $I \subseteq A$ for which $A/I$ is also $M$-graded and such that the
projection homomorphism $A \to A/I$ is a homomorphism of $M$-graded algebras.

For example, suppose $A = D[x]$, the usual polynomial ring, and the ideal $I$ is
$Ax^n$, $n > 1$. Then if one writes $A_i := x^i D$, $0 \le i \le n - 1$, then

\[ A/I = (A_0 + I)/I \oplus (A_1 + I)/I \oplus \cdots \oplus (A_{n-1} + I)/I \]

is a grading on $A/I$ by the monoid $\mathbb{Z}/n\mathbb{Z}$.

On the other hand assume that $D = F$ is a field, and let $I$ be the ideal of $A = F[x]$
generated by the inhomogeneous polynomial $x + x^2$. By the division algorithm, we
see that the quotient algebra $A/I$ has dimension 2 over the field $F$.

Assume, however, that $A/I$ is graded by the nonnegative integers, and that $A \to
A/I$ is a homomorphism of graded algebras. This would yield

\[ A/I = \bigoplus_{m=0}^{\infty} (A_m + I)/I. \]

Since, for each nonnegative integer $m$, $A_m = F \cdot x^m$, we would have $(A_m + I)/I =
F(x^m + I) \neq 0$, since $x^m$ cannot be a multiple of $x + x^2$. In turn, this would clearly
imply that $A/I$ is infinite dimensional, contradicting our prior inference that $A/I$
has $F$-dimension 2. Thus $A/I$ is not a graded algebra in this case.
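
The two facts used in this argument can be checked by a short computation. The sketch below (function name ours) reduces a polynomial modulo $x^2 + x$ by repeatedly using $x^k \equiv -x^{k-1}$ for $k \ge 2$. Every remainder has degree at most 1, which reflects the dimension count, while each power $x^m$ with $m \ge 1$ reduces to $\pm x$ and so is nonzero in the quotient.

```python
def remainder_mod_x_plus_x2(coeffs):
    """Remainder modulo x^2 + x; coeffs[i] is the coefficient of x^i."""
    r = list(coeffs)
    while len(r) > 2:
        lead = r.pop()      # coefficient of x^k, where k >= 2
        r[-1] -= lead       # x^k = -x^(k-1) modulo x^2 + x
    return r + [0] * (2 - len(r))

def power_of_x(m):
    """Coefficient list of the monomial x^m."""
    return [0] * m + [1]

# x^m reduces to (-1)^(m-1) x for m >= 1, so x^m + I is never zero.
remainders = [remainder_mod_x_plus_x2(power_of_x(m)) for m in range(1, 7)]

# The generator x + x^2 itself reduces to zero.
generator_remainder = remainder_mod_x_plus_x2([0, 1, 1])
```

So infinitely many nonzero classes $x^m + I$ all live inside a 2-dimensional space, and they cannot span independent summands.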
Motivated by the above, we call the ideal $I$ of the $M$-graded $R$-algebra $A$ homogeneous
if

\[ A/I = \bigoplus_{m \in M} (A_m + I)/I. \]

Note that this implies both that $A/I$ is $M$-graded via $(A/I)_m = (A_m + I)/I$ and
that the projection homomorphism $A \to A/I$ is a homomorphism of $M$-graded
$R$-algebras.


Theorem 13.8.2 Let $A$ be an $M$-graded $R$-algebra, and let $I$ be an ideal of $A$. Then,
the following are equivalent:

(i) $I$ is homogeneous,
(ii) $I = \bigoplus_{m \in M} (A_m \cap I)$,
(iii) $I$ is generated by homogeneous elements.

Proof Assume (i), i.e., that $I$ is homogeneous; thus

\[ A/I = \bigoplus_{m \in M} (A_m + I)/I. \]

Let $x \in I$ and write $x$ as $x = \sum_{m \in M} x_m$, where each $x_m \in A_m$. We shall show that, in
fact, each $x_m \in I$. Indeed, we have

\[ 0 + I = x + I = \sum_{m \in M} x_m + I = \sum_{m \in M} (x_m + I). \]

Since the last sum is direct, we conclude that each $x_m + I = 0 + I$, i.e., that each
$x_m \in I$. This implies that $I = \sum_{m \in M} (A_m \cap I)$. Since this sum is obviously direct, we
infer (ii).

If we assume (ii), and if elements $a_m \in A_m$ are given with

\[ \sum_{m \in M} (a_m + I) = 0 + I \in A/I, \]

then

\[ \sum_{m \in M} a_m \in I = \bigoplus_{m \in M} (A_m \cap I). \]

This clearly forces each $a_m \in (A_m \cap I) \subseteq I$, and so the sum $\sum_{m \in M} (A_m + I)/I$ is
direct. As it clearly equals $A/I$, we infer condition (i). Thus, conditions (i) and (ii)
are equivalent.

If we assume condition (ii), then it is clear that $I$ is generated by homogeneous
elements, as each $A_m \cap I$, $m \in M$, consists wholly of homogeneous elements, so
(iii) holds.

Finally, assume condition (iii). The homogeneous elements that lie in the ideal $I$
form a set $S = \bigcup_{\sigma \in M} (I \cap A_\sigma)$, which, by hypothesis, generates $I$ as a 2-sided ideal.
Let

\[ H := \bigoplus_{\sigma \in M} H_\sigma, \quad \text{where } H_\sigma = I \cap A_\sigma, \tag{13.10} \]

so that $H$ is the additive group generated by $S$. Clearly $H \subseteq I$. Since part (ii) of
the definition of an $M$-grading requires that homogeneous elements be closed under
multiplication, we see that if $h$ is an arbitrary homogeneous element of $A$, then

\[ hS \cup Sh \subseteq S, \]

since the products to the left of the containment symbol are homogeneous elements
of $A$ lying in the ideal $I$. It follows that $hH$ and $Hh$ both lie in $H$ for any homogeneous
element $h$ chosen from $A$. Since the additive span of these homogeneous elements is
$A$ itself, the distributive law of the ring yields $AHA \subseteq H$, so $H$ is itself a two-sided
ideal of $A$. Since $H$ contains $S$, the ideal-generators of $I$, we have $I \subseteq H$. Since
$H \subseteq I$ we also have $H = I$ and now Eq. (13.10) produces

\[ I = \bigoplus_{\sigma \in M} (I \cap A_\sigma), \]

proving (ii).
Thus all three conditions (i), (ii), and (iii) are equivalent.


13.9 The Tensor Algebras 


13.9.1 Introduction 


In this section and the next we hope to introduce several basic algebras with important 
universal properties. There are two ways to do this: (1) first define the object by a 
construction, and then show it possesses the desired universal property, or (2) first 
base the definition upon a universal mapping property, and then utilize our categorical 
arguments for uniqueness to show that it is something we know or can construct. 
We think that the second strategy is particularly illustrative in the case of the tensor 
algebra. 


13.9.2 The Tensor Algebra: As a Universal Mapping Property 


Fix a vector space $V$ over a field $F$. The tensor algebra of $V$ is defined to be
a pair $(\iota, T(V))$ where $\iota : V \to T(V)$ is an injection of $V$ into an $F$-algebra
$T(V)$ such that every $F$-linear transformation $t : V \to A$ into an $F$-algebra $A$
uniquely factors through $\iota$; that is, there exists a unique $F$-algebra homomorphism
$\theta(t) : T(V) \to A$, such that $\theta(t)$ extends $t$. In other words $t = \theta(t) \circ \iota$ and we have
the commutative triangle below:

[Commutative triangle: $\iota : V \to T(V)$, $t : V \to A$, $\theta(t) : T(V) \to A$, with $t = \theta(t) \circ \iota$.]

Theorem 13.9.1 Define the category whose objects are the pairs $(f, A)$, where $A$
is an $F$-algebra, and $f : V \to A$ is an $F$-linear transformation, for a fixed vector
space $V$. In this category, a morphism from object $(f, A)$ to object $(f', A')$ is an
$F$-algebra homomorphism $\phi : A \to A'$ such that $\phi \circ f = f'$. Then $(\iota, T(V))$ (if it
exists) is an initial object in this category.


Proof This is just a restatement of the universal property described above.
But now the “abstract nonsense” has a wonderful consequence!


The tensor algebra $(\iota, T(V))$ (if it exists) is unique up to isomorphism.


But it does exist and we have met it before. Let $X$ be any basis of the vector
space $V$. We may regard $V$ as a vector subspace of the polynomial ring $F\{X\}$ in
non-commuting indeterminates $X$, namely as the subspace spanned by the monomials
of total degree one. We let $\iota_X : V \hookrightarrow F\{X\}$ denote this containment. Let $A$ be
any $F$-algebra, and let $t : V \to A$ be any linear transformation. Now we note
that if $1_A$ is the multiplicative identity element of $A$, then the subfield $1_A F$ lies in
the center of $A$. Now we apply Exercise (6) in Sect. 7.5.3, with $(t, t(X), F 1_A, A)$
in the roles of $(\alpha, B, R, S)$ to conclude that the restricted mapping $t : X \to A$
extends to a ring homomorphism (actually an algebra homomorphism in this case)
$E_t^* : F\{X\} \to A$, which we called the “evaluation homomorphism”. It follows that
it induces $t : V \to A$ when restricted to its degree-one subspace $V$ (see p. 205). We
also note that the evaluation mapping $E_t^* : F\{X\} \to A$ was uniquely determined
by $t$.


Thus the pair (tx, F{X}) satisfies the definition of a tensor algebra given above. 
We record this as 


Corollary 13.9.2 Let V be a vector space with basis X. Then the tensor algebra is 
isomorphic to the polynomial algebra F{X} in non-commuting indeterminates X. 


13.9.3 The Tensor Algebra: The Standard Construction 


In this construction we utilize the (parenthesis free) construction of the multiple 
tensor product defined in Sect. 13.5. 

Let $F$ be a field and let $V$ be an $F$-vector space. We define a sequence $T^r(V)$ of
$F$-vector spaces by setting $T^0(V) = F$, $T^1(V) = V$, and in general,

\[ T^r(V) = \bigotimes_{i=1}^{r} V = V \otimes_F \cdots \otimes_F V \quad (r \text{ factors}). \]

Set $T^*(V) := \bigoplus_{r=0}^{\infty} T^r(V)$ and define an $F$-bilinear mapping

\[ \mu : T^*(V) \times T^*(V) \to T^*(V) \]

by setting

\[ \mu\Big(\sum_{r=0}^{\infty} \tau_r, \sum_{s=0}^{\infty} \tau_s'\Big) := \sum_{k=0}^{\infty} \sum_{r+s=k} \tau_r \otimes \tau_s' \in T^*(V), \]

where $\tau_r \in T^r(V)$ and $\tau_s' \in T^s(V)$.
This is easily verified to endow $T^*(V)$ with the structure of an $F$-algebra, called the
tensor algebra over $F$. When $V$ has a basis $X$, then $T^*(V)$ has an $F$-basis consisting
of all pure tensors of the form $x_1 \otimes \cdots \otimes x_d$, where $d \in \mathbb{N}$, the $x_i$ are in $X$ and an empty
product is taken to be the identity element $e$ of $T^*(V)$. In terms of this basis of pure
tensors the multiplication follows this rule:

\[ \mu(v_1 \otimes \cdots \otimes v_n, v_{n+1} \otimes \cdots \otimes v_{n+m}) = v_1 \otimes \cdots \otimes v_{n+m}. \tag{13.11} \]


Since this is simply a concatenation of pure multitensors, the operation is asso- 
ciative. 
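
The concatenation rule (13.11) is easy to experiment with. In the following sketch (our own encoding, with integer coefficients for simplicity) a basis tensor $x_1 \otimes \cdots \otimes x_d$ is the tuple of its symbols, the empty tuple plays the role of the identity $e$, and an element of $T^*(V)$ is a dictionary from such words to coefficients.

```python
from collections import defaultdict

def tensor_mul(s, t):
    """Multiply two elements of T*(V); keys are words, i.e. tuples of basis symbols."""
    out = defaultdict(int)
    for w1, c1 in s.items():
        for w2, c2 in t.items():
            out[w1 + w2] += c1 * c2   # concatenate pure tensors, as in (13.11)
    return {w: c for w, c in out.items() if c != 0}

e = {(): 1}                          # identity element: the empty product
x = {("x",): 1}
y = {("y",): 1}
u = {("x",): 2, ("x", "y"): -1}      # 2x - x (x) y

assoc_left = tensor_mul(tensor_mul(u, x), y)
assoc_right = tensor_mul(u, tensor_mul(x, y))
xy = tensor_mul(x, y)
yx = tensor_mul(y, x)
```

Associativity is inherited from concatenation of tuples, and `xy` differs from `yx`, so the algebra is not commutative as soon as the basis has two elements.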

Let $\iota : V = T^1(V) \to T^*(V)$ be the (injective) containment mapping. We shall
show that the pair $(\iota, T^*(V))$ satisfies the universal mapping property of our previous
definition of a tensor algebra $T(V)$.


Theorem 13.9.3 Fix an $F$-vector space $V$. Define the category whose objects are
the pairs $(A, f)$, where $A$ is any $F$-algebra, and $f : V \to A$ is any $F$-linear
transformation. A morphism from $(A, f)$ to $(A', f')$ in this category is an $F$-algebra
homomorphism $\phi : A \to A'$ such that $\phi \circ f = f'$. Then $(\iota, T^*(V))$ (the embedding
of $V$ into its tensor algebra) is an initial object in this category.


Proof Let $(A, f)$ be an object in this category. Note that if $\phi : T^*(V) \to A$ is an
$F$-algebra homomorphism such that $\phi \circ \iota = f$, then for all nonnegative integers $r$,
we must have $\phi(v_1 \otimes v_2 \otimes \cdots \otimes v_r) = f(v_1) f(v_2) \cdots f(v_r)$. Since any element
of $T^*(V)$ is a sum of such simple tensors, such a $\phi : T^*(V) \to A$ must
be unique. It remains to prove the existence of such a mapping $\phi$. First define $\phi_0 :
T^0(V) = Fe \to F 1_A$ to be the $F$-linear mapping which takes the identity element
$e$ of $T^*(V)$ to $1_A$, the identity element of the algebra $A$. Now, by Theorem 13.5.2,
for each positive integer $r$, the $F$-balanced mapping

\[ f_r : V \times V \times \cdots \times V \to A, \quad \text{where } f_r(v_1, v_2, \ldots, v_r) = f(v_1) f(v_2) \cdots f(v_r), \]

induces a unique $F$-linear mapping

\[ \phi_r : T^r(V) = V \otimes_F V \otimes_F \cdots \otimes_F V \to A, \]




where $\phi_r(v_1 \otimes v_2 \otimes \cdots \otimes v_r) = f(v_1) f(v_2) \cdots f(v_r)$. In turn, by the universality
of the direct sum $T^*(V) = \bigoplus_{r=0}^{\infty} T^r(V)$, we then get a unique $F$-linear mapping

\[ \phi : T^*(V) = \bigoplus_{r=0}^{\infty} T^r(V) \to A \]

which is an $F$-algebra homomorphism. This proves the existence of $\phi$, and the result
follows.
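
The existence half of this proof is constructive, and a small sketch may make it tangible. Under assumptions of our own (a basis $\{x, y\}$ of $V$, the target algebra $A = M_2(\mathbb{Z})$, and the word encoding of $T^*(V)$ by tuples of symbols), the function `extend` below builds $\phi$ from the values of $f$ on the basis by sending a word to the product of the images of its letters and extending linearly. All names are hypothetical.

```python
from collections import defaultdict

def tensor_mul(s, t):
    """Product in T*(V): concatenate words and multiply coefficients."""
    out = defaultdict(int)
    for w1, c1 in s.items():
        for w2, c2 in t.items():
            out[w1 + w2] += c1 * c2
    return {w: c for w, c in out.items() if c != 0}

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def extend(f, n):
    """Return phi with phi(x1 (x) ... (x) xr) = f(x1) ... f(xr), extended linearly."""
    def phi(element):
        total = [[0] * n for _ in range(n)]
        for word, coeff in element.items():
            image = identity(n)          # the empty word goes to the identity of A
            for symbol in word:
                image = matmul(image, f[symbol])
            total = [[total[i][j] + coeff * image[i][j] for j in range(n)]
                     for i in range(n)]
        return total
    return phi

f = {"x": [[0, 1], [0, 0]], "y": [[0, 0], [1, 0]]}   # an arbitrary choice of images
phi = extend(f, 2)

s = {(): 1, ("x",): 2}                 # e + 2x
t = {("y",): 1, ("x", "y"): -3}        # y - 3 x (x) y
```

One can then compare `phi(tensor_mul(s, t))` with `matmul(phi(s), phi(t))`; they agree, as the homomorphism property requires.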


At this point, we know from the uniqueness of initial objects that the three objects
$T(V)$, $F\{X\}$ (where $X$ is a basis of $V$) and $T^*(V)$ are all the same object.⁷
In fact, the isomorphism $T^*(V) \to F\{X\}$ is induced by the following mapping
connecting their basis elements:

\[
\begin{aligned}
e &\mapsto 1, && \text{the monic polynomial of degree zero,} \\
x_1 \otimes \cdots \otimes x_n &\mapsto x_1 x_2 \cdots x_n, && \text{a monomial of degree } n.
\end{aligned}
\]

(Here $n$ is any positive integer and the $x_i$ lie in $X$, a basis for $V$.)
From now on we will write $T(V)$ for $T^*(V)$.


13.10 The Symmetric and Exterior Algebras 


13.10.1 Definitions of the Algebras 


Inside the tensor algebra $T(V)$ over the field $F$, we define the following homogeneous
ideals (cf. Theorem 13.8.2):

(i) $I \subseteq T(V)$, generated by all elements of the form $v \otimes w - w \otimes v$, $v, w \in V$;
(ii) $J \subseteq T(V)$, generated by all simple tensors of the form $v \otimes v$, $v \in V$.

In terms of the above ideals, we now define the symmetric algebra over $V$ to
be $S(V) := T(V)/I$. Likewise, we define the exterior algebra by setting $E(V) :=
T(V)/J$. By Theorem 13.8.2, these are both graded algebras. Thus,

\[ S(V) = \bigoplus_{r=0}^{\infty} S^r(V), \quad \text{where each } S^r(V) = (T^r(V) + I)/I, \]

and

\[ E(V) = \bigoplus_{r=0}^{\infty} E^r(V), \quad \text{where each } E^r(V) = (T^r(V) + J)/J. \]

(Many authors write $\bigwedge V$ for $E(V)$ and then write $\bigwedge^r V$ for $E^r(V)$.)


7 Up to isomorphism, of course.

We denote by $\iota_S : V \to S(V)$ and by $\iota_E : V \to E(V)$ the $F$-linear mappings

\[ \iota_S(v) := v + I \in S(V), \qquad \iota_E(v) := v + J \in E(V), \quad v \in V. \]

Since it is clear that $V \cap I = V \cap J = 0$, we see that $\iota_S$ and $\iota_E$ are both injective
$F$-linear transformations. We shall often identify $V$ with its respective image
subspaces $\iota_S(V) \subseteq S(V)$ and $\iota_E(V) \subseteq E(V)$ and write $V \subseteq S(V)$ and $V \subseteq E(V)$.

Multiplication in $S(V)$ is typically written simply as juxtaposition: if $a, b \in S(V)$,
then the product of $a$ and $b$ is denoted $ab \in S(V)$. We note that $S(V)$ is a commutative
algebra. Indeed, any element of $S(V)$ is an $F$-linear combination of elements of the
form $v_1 v_2 \cdots v_r$ (that is, the product $v_1 \otimes v_2 \otimes \cdots \otimes v_r + I$), where $v_1, v_2, \ldots, v_r \in V$.
Now write

\[ a := v_1 v_2 \cdots v_r, \qquad b := w_1 w_2 \cdots w_s, \]

where $v_1, \ldots, v_r, w_1, \ldots, w_s \in V$. Since for any vectors $v, w \in V$ we have
$vw = wv$, it follows immediately that $ab = ba$, and so $S(V)$ is
commutative, as claimed.

In the exterior algebra, products are written as “wedge products”: if $c, d \in E(V)$,
their product is denoted $c \wedge d$. It follows immediately that for all vectors $v, w \in V$,
one has $v \wedge v = 0$, and from $(v + w) \wedge (v + w) = 0$ one obtains $v \wedge w = -w \wedge v$.
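
These two identities can be observed in a concrete model of the degree-two part. The sketch below assumes, as our own choice and not the text's construction, that $v \wedge w$ is represented inside $T^2(V)$ by the antisymmetric matrix with entries $v_i w_j - w_i v_j$ (the coordinates of $v \otimes w - w \otimes v$); the assignment is alternating and bilinear, so it is compatible with passing to the quotient by $J$.

```python
def wedge(v, w):
    """Model of v ^ w in degree two: the matrix of v (x) w - w (x) v."""
    n = len(v)
    return [[v[i] * w[j] - w[i] * v[j] for j in range(n)] for i in range(n)]

def negate(m):
    return [[-x for x in row] for row in m]

def add(m1, m2):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

v = [1, 2, 3]
w = [4, 5, 6]
zero = [[0] * 3 for _ in range(3)]
v_plus_w = [a + b for a, b in zip(v, w)]

# Expanding (v + w) ^ (v + w) by bilinearity gives the four terms below.
expanded = add(add(wedge(v, v), wedge(v, w)), add(wedge(w, v), wedge(w, w)))
```

Here `wedge(v, v)` is the zero matrix, `expanded` equals `wedge(v_plus_w, v_plus_w)`, and both vanish, which forces `wedge(v, w)` to be the negative of `wedge(w, v)`.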

Like the tensor algebra, the universal properties satisfied by the symmetric and
exterior algebras involve an $F$-linear mapping $f : V \to A$ into an $F$-algebra $A$. Such an
$F$-linear mapping is said to be commuting if and only if $f(u) f(v) = f(v) f(u)$ for
all $u, v \in V$; that is, the image $f(V)$ generates a commutative subring. Similarly
such a mapping is said to be alternating if and only if $f(v)^2 = 0$, for all $v \in V$.


Theorem 13.10.1 Let $F$ be a field, let $V$ be a vector space over $F$, and let
$S(V)$, $E(V)$ be the symmetric and exterior algebras over $V$, respectively.

(i) Consider the category of pairs $(A, f)$ of $F$-algebras and commuting $F$-linear
mappings $f : V \to A$. A morphism from the pair $(A, f)$ to the pair $(A', f')$ is an
$F$-algebra homomorphism $\phi : A \to A'$ such that $\phi \circ f = f'$. Then $(S(V), \iota_S)$
is an initial object in this category.

(ii) Consider the category of pairs $(A, f)$ of $F$-algebras and alternating $F$-linear
mappings $f : V \to A$. A morphism from the pair $(A, f)$ to the pair $(A', f')$ is an
$F$-algebra homomorphism $\phi : A \to A'$ such that $\phi \circ f = f'$. Then $(E(V), \iota_E)$
is an initial object in this category.


We shall leave the proof to the motivated reader in Exercise (2) in Sect. 13.13.7.
The uniqueness of these algebras follows. Whenever one hears the word “uniqueness”
in a categorical setting, one knows that isomorphisms are being implied, and so
a number of corollaries can be expected to result from the preceding Theorem 13.10.1.
We produce some of them in the next subsection.


13.10.2 Applications of Theorem 13.10.1 


A Model for S(V) 


We begin with this example: From the uniqueness assertion in Theorem 13.10.1, 
one can easily exploit the evaluation morphism of polynomial rings (in commuting 
indeterminates) to prove the following: 


Corollary 13.10.2 Let $X$ be an $F$-basis for the vector space $V$. Then $S(V) \cong F[X]$
as $F$-algebras.


The proof is simply that the universal property of the evaluation mapping
for polynomial rings in commuting indeterminates shows that the pair $(\iota_X, F[X])$,
where $\iota_X : V \to F[X]$ is the $F$-linear mapping extending the inclusion of $X$ in $F[X]$,
satisfies the universal mapping property of the symmetric algebra. (See Sect. 7.3 for
background and Exercise (8) in Sect. 7.5.3 for the universal property. A formal proof
is requested in Exercise (3) in Sect. 13.13.7.)
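
To make the model $F[X]$ concrete, here is a minimal Python sketch that lists the monomials of a given degree in finitely many commuting indeterminates; these monomials form a basis of the corresponding graded piece. The function names and the encoding of a monomial as a sorted tuple are illustrative assumptions, not notation from the text.

```python
from itertools import combinations_with_replacement


def degree_r_monomials(basis, r):
    """All monomials of degree r in the commuting indeterminates `basis`.

    A monomial is encoded (hypothetically) as a tuple of basis elements
    listed in the order in which they appear in `basis`.
    """
    return list(combinations_with_replacement(basis, r))


def multiply(m1, m2):
    """Product of two monomials; sorting forgets the order of the factors,
    which is exactly the commutativity of S(V)."""
    return tuple(sorted(m1 + m2))


X = ("x", "y", "z")
for r in range(4):
    # The number of monomials is the dimension of the degree-r piece.
    print(r, len(degree_r_monomials(X, r)))
```

Because `multiply` sorts its result, `multiply(("x",), ("y",))` and `multiply(("y",), ("x",))` agree, mirroring $vw = wv$ in $S(V)$.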


A Model for E(V) 


Let $(X, <)$ denote a totally ordered poset bijective with a basis $X$ of the vector space
$V$. Let $C$ be the collection of all finite ascending subchains (including the empty
chain, $\emptyset$) of $(X, <)$. (Clearly this collection is in one-to-one correspondence with
the finite subsets of $X$. It forms a boolean lattice $(C, \le)$ under the refinement relation
among chains. Thus the “join” $c_1 \vee c_2$ is the smallest chain refining both $c_1$ and $c_2$.)
For every non-empty ascending chain $c = \{x_1 < \cdots < x_r\} \in C$, let $w_c$ be the
word $x_1 x_2 \cdots x_r$ (viewed either as a word in the free monoid over $X$ or simply as a
sequence of elements of $X$). For the empty chain $\emptyset$ we denote $w_\emptyset$ by the symbol $e$.
(For convenience, we shall refer to these $w_c$ as “chain-words”.)
Let $A$ be the $F$-vector space defined as the formal direct sum

$$A = \bigoplus_{c \in C} F w_c.$$


We convert $A$ into an $F$-algebra by defining multiplication between basis elements
and extending this to a definition of multiplication (“$*$”) on $A$ by the usual distributive
laws. In order to accomplish this we consider two chains $a = \{x_1 < \cdots < x_r\}$ and
$b = \{y_1 < \cdots < y_s\}$. We let $m(a, b)$ be the number of pairs $(x_i, y_j)$ such that $x_i > y_j$.
Then if the two chains $a$ and $b$ possess a common element, we set $a * b := 0$. But if the
chains $a$ and $b$ possess no common element, their disjoint union can be reassembled
into a properly ascending chain $c = a \vee b$, the smallest common refinement of the
chains $a$ and $b$. In that case we set

$$a * b := (-1)^{m(a,b)} w_c.$$

(Note that $(-1)^{m(a,b)}$ is simply the sign of the permutation $\pi(a, b)$ of indices that
permutes the concatenated sequence $w_a \cdot w_b$ to $w_{a \vee b}$.)

To make sure that the student gets the sense of this, note that when $(X, <) =
\{1 < 2 < \cdots < 9\}$, we have

$$(1 < 2 < 4 < 7) * (3 < 6 < 7) = 0 \quad\text{while}$$
$$(1 < 2 < 4 < 7) * (3 < 6 < 8) = -(1 < 2 < 3 < 4 < 6 < 7 < 8).$$
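
The rule for multiplying chains is short enough to compute directly. The following Python sketch assumes a chain is stored as a strictly increasing tuple and returns a pair (sign, chain); the name `chain_product` and this encoding are illustrative choices, not notation from the text.

```python
def chain_product(a, b):
    """Product a * b of two ascending chains.

    Returns (0, ()) when the chains share an element; otherwise returns
    ((-1) ** m(a, b), a v b), where m(a, b) counts the pairs (x, y) with
    x in a, y in b and x > y.
    """
    if set(a) & set(b):
        return 0, ()
    m = sum(1 for x in a for y in b if x > y)
    return (-1) ** m, tuple(sorted(a + b))


print(chain_product((1, 2, 4, 7), (3, 6, 7)))  # chains share 7
print(chain_product((1, 2, 4, 7), (3, 6, 8)))  # m(a, b) = 3, so the sign is -1
```

In the second product the inverted pairs are $(4,3)$, $(7,3)$ and $(7,6)$, which accounts for the minus sign in the example above.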


To show that “$*$” is an associative binary operation for $A$ it is enough to check
it on the basis elements, the chains. Thus we wish to compare two products $(a *
b) * c$ and $a * (b * c)$ where $w_a, w_b, w_c$ are the respective ascending sequences
$(a_1, \ldots, a_r)$, $(b_1, \ldots, b_s)$, $(c_1, \ldots, c_t)$. We may assume $r, s, t$ are all positive integers
and that $a$, $b$ and $c$ pairwise share no common element (otherwise $(a * b) * c = 0 =
a * (b * c)$). Form the concatenation of sequences

$$w_a \cdot w_b \cdot w_c = (a_1, \ldots, a_r, b_1, \ldots, b_s, c_1, \ldots, c_t) = (y_1, \ldots, y_{r+s+t}) =: y$$

and let $\pi$ be the permutation of indices that rearranges $y$ into ascending order,
thus realizing the chain $a \vee b \vee c$. (We view the group of permutations of sequences
as acting on sequences from the right, so that a factorization of permutations $\pi_1 \pi_2$
means $\pi_1$ is applied first.) Now the permutation $\pi$ has two factorizations: (i) $\pi =
\pi(a, b) \cdot \pi(a \vee b, c)$ and (ii) $\pi = \pi(b, c) \cdot \pi(a, b \vee c)$. Here $\pi(a, b)$ fixes the last $t$
indices to effect $w_a \cdot w_b \cdot w_c \to w_{a \vee b} \cdot w_c$, while $\pi(a \vee b, c)$ rearranges the entries of
the latter sequence to produce $w_{a \vee b \vee c}$. So (i) holds. Similarly, we could also apply
$\pi(b, c)$ to the last $s + t$ entries in $y = w_a \cdot w_b \cdot w_c$ while fixing the first $r$ entries to
get $w_a \cdot w_{b \vee c}$ and then apply $\pi(a, b \vee c)$ to obtain the ascending sequence $w_{a \vee b \vee c}$.
Thus the factorization (ii) holds.

Now $(a * b) * c = a * (b * c) = \operatorname{sgn}(\pi)\, w_{a \vee b \vee c}$ follows from the associativity
of the “join” operation in $(C, \le)$, the definition of “$*$”, and

$$\operatorname{sgn}(\pi) = \operatorname{sgn}(\pi(a, b)) \operatorname{sgn}(\pi(a \vee b, c)) = \operatorname{sgn}(\pi(b, c)) \operatorname{sgn}(\pi(a, b \vee c)),$$

given to us by the factorizations (i) and (ii).
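
As an informal sanity check of the associativity argument (not a substitute for it), one can test every triple of chains over a small totally ordered set by brute force. This sketch again encodes chains as increasing tuples; all function names are hypothetical.

```python
from itertools import combinations


def chain_product(a, b):
    """(sign, chain) for a * b; (0, ()) if the chains share an element."""
    if set(a) & set(b):
        return 0, ()
    m = sum(1 for x in a for y in b if x > y)
    return (-1) ** m, tuple(sorted(a + b))


def left_product(a, b, c):
    """Compute (a * b) * c as a (sign, chain) pair."""
    s1, ab = chain_product(a, b)
    if s1 == 0:
        return 0, ()
    s2, abc = chain_product(ab, c)
    return s1 * s2, abc


def right_product(a, b, c):
    """Compute a * (b * c) as a (sign, chain) pair."""
    s1, bc = chain_product(b, c)
    if s1 == 0:
        return 0, ()
    s2, abc = chain_product(a, bc)
    return s1 * s2, abc


def all_chains(n):
    """All ascending chains (including the empty one) in {1 < ... < n}."""
    return [c for k in range(n + 1) for c in combinations(range(1, n + 1), k)]


def is_associative(n):
    chains = all_chains(n)
    return all(left_product(a, b, c) == right_product(a, b, c)
               for a in chains for b in chains for c in chains)


print(len(all_chains(4)), is_associative(4))
```

The count printed first is the number of basis elements of $A$ when $X$ has four elements.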

Now, extending the binary operation $*$ to all of $A$ by linearity from its definition
on the basis $C$, we obtain an $F$-vector space $A$ with an associative binary operation
“$*$” which distributes with respect to addition and respects scalar multiplication from
$F$ on either side. One easily checks that $e * a = a * e = a$ for each $a \in A$, so $e$ is a
multiplicative identity. Thus $(A, *)$ is an algebra.

Notice that the lengths of the chains produce a grading $A = \bigoplus_d A_d$ where, for
every natural number $d$, $A_d$ is the subspace spanned by all $w_c$ as $c$ ranges over the
chains of length $d$. Thus $A_0 = Fe$ and $A_r * A_s \subseteq A_{r+s}$.

At this stage we see that $(A, *)$ is a genuine graded $F$-algebra which we call the
algebra of ascending chains. Its dimension is clearly the number of finite subchains
of $(X, <)$, and if $V$ has finite dimension $n$ then $A$ has dimension $2^n$.

There is a natural injection $\iota_A : V \to A$ mapping linear combinations of basis
elements $x_i$ of $V$ to the corresponding linear combination of chain-words $w_i := w_{\{x_i\}}$
of length one.

If $x_\sigma < x_\tau$ is a chain, and $w_\sigma = \iota_A(x_\sigma)$ and $w_\tau = \iota_A(x_\tau)$ are the length-one
chain-words $\{x_\sigma\}$ and $\{x_\tau\}$, we have

$$w_\sigma^2 = w_\tau^2 = 0 \quad (\text{writing } a^2 \text{ for } a * a) \quad\text{and}$$

$$w_\sigma * w_\tau = -w_\tau * w_\sigma.$$

This gives us $\iota_A(v)^2 = 0$ for every vector $v \in V$, and so $\iota_A : V \to A$ is an alternating
mapping as defined above. Thus the universal property of $E(V)$ requires that there is a
unique algebra epimorphism $\epsilon : E(V) \to A$ extending $\iota_A$. If $V$ had finite dimension,
we would know at once that $\epsilon$ is an isomorphism. Instead we obtain the isomorphism
by showing that $(A, \iota_A)$ possesses the desired universal mapping property.

Suppose $f : V \to (B, \circ)$ is an alternating mapping into an $F$-algebra $B$ having
identity element $1_B$. Then of course

$$f(v)^2 = 0, \quad\text{and} \qquad (13.12)$$
$$f(u) \circ f(v) = -f(v) \circ f(u) \qquad (13.13)$$

for all vectors $u, v \in V$.

We then define an $F$-linear mapping $\alpha_f : A \to B$ by describing its values on a
basis for $A$. First we set $\alpha_f(e) = 1_B$. For each finite ascending chain $c = (x_1, \ldots, x_n)$
in $(X, <)$ we set

$$\alpha_f(w_c) = f(x_1) \circ \cdots \circ f(x_n).$$

This can be extended to linear combinations of the $x_i$ to yield

$$\alpha_f(\iota_A(v)) = f(v), \quad\text{for all } v \in V.$$

We also note that for any two finite ascending chains $a$ and $b$ in $(X, <)$, we have

$$\alpha_f(w_a * w_b) = \alpha_f(w_a) \circ \alpha_f(w_b).$$

So once again extending by linearity, we see that

$$\alpha_f : A \to B$$

is an algebra homomorphism and that $\alpha_f \circ \iota_A = f$. Thus $A$ possesses the
universal property which defines the exterior algebra. By the uniqueness asserted in
Theorem 13.10.1, we have a companion to Corollary 13.10.2:


Corollary 13.10.3 Suppose V is any F-vector space. Then the exterior algebra 
E(V) is isomorphic to the algebra (A, *) of finite ascending chains of a total ordering 
of a basis X of V. 

In particular, if $V$ is an $F$-vector space of finite dimension $n$, then the exterior
algebra $E(V)$ is isomorphic to the algebra of ascending chains, $(A, *)$, based on the
chain $N = \{1 < \cdots < n\}$. It is a graded algebra of dimension $2^n$.


Morphisms Induced on S(V) and E(V) 


Suppose $t : V \to V$ is any $F$-linear transformation of $V$. Then $\iota_S \circ t$ is a commuting
mapping $V \to S(V)$ and so, by the universal mapping property of $S(V)$ (placing
$(\iota_S \circ t, S(V))$ in the role of $(f, A)$), we obtain an algebra homomorphism

$$S(t) : S(V) \to S(V)$$

which forces the commutativity of the following diagram:

$$\begin{array}{ccc}
S(V) & \xrightarrow{\;S(t)\;} & S(V) \\
{\scriptstyle \iota_S}\uparrow & & \uparrow{\scriptstyle \iota_S} \\
V & \xrightarrow{\;\;t\;\;} & V
\end{array}$$

The algebra homomorphism $S(t) : S(V) \to S(V)$ preserves the grading and so
induces morphisms

$$S^r(t) : S^r(V) \to S^r(V)$$

taking any product of elements of $V$, $v_1 v_2 \cdots v_r$, to $v_1^t v_2^t \cdots v_r^t$, where $v^t$ denotes the
image of the vector $v \in V$ under the linear transformation $t$.
It is apparent that if $s, t \in \hom_F(V, V)$, then

$$S^r(s \circ t) = S^r(s) \circ S^r(t).$$


The same journey can be emulated for the exterior algebra $E(V)$ using its
particular universal mapping property. Given an $F$-linear $t : V \to V$ one obtains the
alternating mapping $\iota_E \circ t : V \to E(V)$ and so, by the universal mapping property, a
commutative diagram

$$\begin{array}{ccc}
E(V) & \xrightarrow{\;E(t)\;} & E(V) \\
{\scriptstyle \iota_E}\uparrow & & \uparrow{\scriptstyle \iota_E} \\
V & \xrightarrow{\;\;t\;\;} & V
\end{array}$$

where the algebra homomorphism $E(t)$ at the top of the diagram preserves degrees
and so induces mappings

$$E^r(t) : E^r(V) \to E^r(V)$$

taking a product of vectors in $V$, $v_1 \wedge v_2 \wedge \cdots \wedge v_r$, to $v_1^t \wedge v_2^t \wedge \cdots \wedge v_r^t$, where $v^t$ denotes
the image of an arbitrary vector $v$ under the linear transformation $t$. Again

$$E^r(s \circ t) = E^r(s) \circ E^r(t).$$
We collect these observations in this way:

Corollary 13.10.4 Suppose $t \in \hom_F(V, V)$. Let

$$\iota_S : V \to S(V), \quad\text{and}$$
$$\iota_E : V \to E(V)$$

be the canonical embeddings of $V$ into the symmetric and exterior algebras,
respectively.

Then there exist $F$-algebra homomorphisms

$$S(t) : S(V) \to S(V) \quad\text{and}$$
$$E(t) : E(V) \to E(V)$$

such that one has

$$S(t) \circ \iota_S = \iota_S \circ t, \quad\text{and}$$
$$E(t) \circ \iota_E = \iota_E \circ t,$$

respectively. Both $S(t)$ and $E(t)$ preserve the algebra grading and so induce
mappings

$$S^r(t) : S^r(V) \to S^r(V) \quad\text{and}$$
$$E^r(t) : E^r(V) \to E^r(V).$$

For any $r$-tuple $(v_1, \ldots, v_r) \in V^r$ one has

$$S^r(t)(v_1 \cdots v_r) = v_1^t v_2^t \cdots v_r^t, \quad\text{and}$$

$$E^r(t)(v_1 \wedge \cdots \wedge v_r) = v_1^t \wedge v_2^t \wedge \cdots \wedge v_r^t.$$

Finally if the symbol $\Phi$ denotes either $T$, $T^r$, $S$, $S^r$, $E$, or $E^r$, one has

$$\Phi(s \circ t) = \Phi(s) \circ \Phi(t). \qquad (13.14)$$
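
For a finite-dimensional $V$ with a chosen basis, Eq. (13.14) for $\Phi = E^r$ can be observed on matrices. The sketch below relies on a standard fact not proved in the text: relative to the basis of wedge products $e_{i_1} \wedge \cdots \wedge e_{i_r}$ with $i_1 < \cdots < i_r$, the matrix of $E^r(t)$ consists of the $r \times r$ minors of the matrix of $t$. The helper names, the use of integer matrices, and the lexicographic ordering of the index sets are all illustrative assumptions.

```python
from itertools import combinations, permutations


def det(m):
    """Determinant via the Leibniz formula (adequate for small matrices)."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inversions
        for i in range(n):
            term *= m[i][p[i]]
        total += term
    return total


def matmul(a, b):
    """Ordinary matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def exterior_power(m, r):
    """Matrix of all r x r minors of m, indexed by increasing r-subsets."""
    idx = list(combinations(range(len(m)), r))
    return [[det([[m[i][j] for j in cols] for i in rows]) for cols in idx]
            for rows in idx]


s = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
t = [[2, 1, 1], [0, 1, 0], [1, 0, 3]]
# Functoriality: the minors of a product are the product of the minor matrices.
print(exterior_power(matmul(s, t), 2) == matmul(exterior_power(s, 2), exterior_power(t, 2)))
```

When $r = \dim V$ the matrix returned by `exterior_power` is $1 \times 1$ and its single entry is the determinant, which is the point of view taken in Exercise (4) in Sect. 13.13.7.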


Remark Of course by now even the beginning student has noticed that $T$, $T^r$,
$S$, $S^r$, $E$, and $E^r$ (all of them) are functors $\mathcal{V} \to \mathcal{V}$ where $\mathcal{V}$ is an appropriate
sub-category of Vect, the category of all $F$-vector spaces.


There are other consequences of Corollary 13.10.4. For example, if $t : V \to V$ is
an invertible linear transformation, then so are $S(t)$, $S^r(t)$, $E(t)$ and $E^r(t)$. Thus


Corollary 13.10.5 If $t : V \to V$ is invertible, then $S(t)$ and $E(t)$ are automorphisms
of the $F$-algebras $S(V)$ and $E(V)$, respectively.
In particular, there are injective homomorphisms

$$\rho_S : GL(V) \to \operatorname{Aut}(S(V))$$
$$\rho_E : GL(V) \to \operatorname{Aut}(E(V))$$

whose image elements preserve the grading.
As a result, for any subgroup $G$ of $GL(V)$ one obtains group representations

$$\rho_S^r : G \to GL(S^r(V))$$
$$\rho_E^r : G \to GL(E^r(V))$$


Remark All of the results above were consequences of just one result:
Theorem 13.10.1. There is a lesson here for the student.⁸ It is that universal mapping
properties, as abstract as they may seem, possess a real punch! Suppose you are in
a situation where something is defined in the heavenly sunlight of a universal
mapping property. Of course that does not mean it exists. But it does mean that any two
constructions performed in the shade which satisfy this universal property (whether
known or new) are isomorphic (a consequence of the “nonsense” about initial and
terminal objects in a category). So we prove existence. What more is there to do,
one might ask? The answer is that one does not so easily depart from a gold mine
one is standing on. Here are three ways the universal characterizations produce
results: (1) It is a technique by which one can prove that two constructed algebraic
objects are isomorphic: simply show that they satisfy the same universal mapping
property within the same category. (2) One can prove that an object defined in one
context has a specific property by recasting it in the context of another “solution”
of the same universal mapping property. (3) One can prove the existence of derived
endomorphisms and automorphisms of the universal object.


⁸ Of course such a phrase should be a signal that the teacher is about to become engulfed by the
urge to present a sermon and that perhaps the listener should surreptitiously head for an exit! But
how else can a teacher portray a horizon beyond a mere list of specific theorems?

13.11 Basic Multilinear Algebra


In Sect. 13.5 we proved that R-multibalanced mappings from a finite direct sum of 
symmetric (R, R)-bimodules to a target (R, R)-bimodule factored through the mul- 
tiple tensor product (see Theorem 13.5.2). In this brief subsection we wish to extend 
this theorem to alternating and symmetric forms (the key theorems of multilinear 
algebra), but we shall do this in the case that R is a field so that we can exploit 
universal properties of the symmetric and exterior algebras developed in the pre- 
vious section. As the student shall observe, the device for transferring statements 
about algebras to statements about multilinear mappings depends critically upon our 
discussion of graded algebras and homogeneous ideals. 

First the context: Let $V$ and $W$ be vector spaces over a field $F$. We regard both $V$
and $W$ as symmetric $(F, F)$-bimodules; that is, $av = va$ for all $(a, v) \in F \times V$ or
$F \times W$.

A mapping 

$$f : V \times \cdots \times V \ (n \text{ factors}) \to W$$


is $F$-multilinear if and only if (1) it is additive in each of its $n$ arguments when the
remaining arguments are held fixed, and (2) for any $n$-tuple


of vectors of $V$, $(v_1, \ldots, v_n)$, for any field element $a \in F$, and for any index $j < n$,
one has

$$f(v_1, \ldots, v_j a, v_{j+1}, \ldots, v_n) = f(v_1, \ldots, v_j, a v_{j+1}, \ldots, v_n) \qquad (13.15)$$

$$= (f(v_1, \ldots, v_n))a. \qquad (13.16)$$

Such an $F$-multilinear mapping is said to be alternating if and only if
$f(v_1, \ldots, v_n) = 0$ if at least two distinct entries among the $v_i$ are equal. By
considering a sequence whose $i$th and $(i + 1)$st entries are both $v_i + v_{i+1}$, one sees that for
an alternating $F$-multilinear mapping $f$, the sign of $f(v_1, \ldots, v_n)$ is changed by the
transposition of entries in the $i$th and $(i + 1)$st positions. It follows that the sign is in
fact changed by any odd permutation of the entries, but the value of $f$ is preserved
by any even permutation of the entries.
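
The sign change under an adjacent transposition can be written out as follows; this is a routine expansion using only multilinearity and the alternating condition, with the entries outside positions $i$ and $i + 1$ held fixed and abbreviated by dots:

```latex
\begin{aligned}
0 &= f(\ldots, v_i + v_{i+1},\, v_i + v_{i+1}, \ldots) \\
  &= f(\ldots, v_i, v_i, \ldots) + f(\ldots, v_i, v_{i+1}, \ldots)
     + f(\ldots, v_{i+1}, v_i, \ldots) + f(\ldots, v_{i+1}, v_{i+1}, \ldots) \\
  &= f(\ldots, v_i, v_{i+1}, \ldots) + f(\ldots, v_{i+1}, v_i, \ldots),
\end{aligned}
```

so that $f(\ldots, v_{i+1}, v_i, \ldots) = -f(\ldots, v_i, v_{i+1}, \ldots)$. Since the adjacent transpositions generate the symmetric group, the statement about odd and even permutations follows.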

An $F$-multilinear mapping $f : V \times \cdots \times V$ ($n$ factors) $\to W$ is said to be
symmetric if $f(v_1, \ldots, v_n)$ is unchanged under the transposition $(v_i, v_{i+1})$
interchanging the two entries in the $i$th and $(i + 1)$st positions. It then follows that the value
of $f$ is unchanged by any permutation of its arguments.


Theorem 13.11.1 (Fundamental Theorem of Multilinear Mappings) Let $V$ be a
vector space over a field $F$, viewed as an $(F, F)$-bimodule so that $av = va$ for all
$(a, v) \in F \times V$. Throughout, we assume that $f$ is an $F$-multilinear mapping

$$f : V \times \cdots \times V \ (n \text{ factors}) \to W$$

into another $(F, F)$-bimodule $W$. The following three statements hold:

1. $f = \bar{f} \circ \iota$ where $\bar{f}$ is a homomorphism

$$\bar{f} : T^n(V) = V \otimes \cdots \otimes V \to W,$$

uniquely determined by $f$. Specifically, the mapping $f$ “factors through $\bar{f}$” so
that

$$f(v_1, v_2, \ldots, v_n) = \bar{f}(v_1 \otimes \cdots \otimes v_n),$$

for all $n$-tuples of vectors $v_i \in V$.

2. Now suppose the mapping $f$ is alternating. Then the mapping $f$ factors through
a unique mapping

$$f_E : E^n(V) = V \wedge \cdots \wedge V \ (n \text{ factors}) \to W$$

so that

$$f(v_1, \ldots, v_n) = f_E(v_1 \wedge v_2 \wedge \cdots \wedge v_n),$$

for all $n$-tuples of vectors $v_i \in V$.

3. Finally suppose the multilinear mapping $f$ is symmetric. Then $f$ factors
through a unique $f_S : S^n(V) \to W$ where $S^n(V)$ is the space of all
homogeneous elements of degree $n$ in the graded symmetric algebra $S(V)$. Thus

$$f(v_1, \ldots, v_n) = f_S(v_1 \cdots v_n).$$


Proof Part 1 is just Theorem 13.5.2 with the commutative ring $R$ replaced by a field.
Part 2. By definition $E(V) = T(V)/J$ where $J$ is the homogeneous ideal given
at the beginning of Sect. 13.10. We have these direct decompositions:

$$T(V) = F \oplus V \oplus T^2(V) \oplus \cdots \oplus T^r(V) \oplus \cdots \qquad (13.17)$$
$$J = (F \cap J) \oplus (V \cap J) \oplus \cdots \oplus (T^r(V) \cap J) \oplus \cdots \qquad (13.18)$$
$$E(V) = F \oplus V \oplus \bigoplus_{r \ge 2} \bigl(T^r(V)/(T^r(V) \cap J)\bigr) \qquad (13.19)$$
$$= F \oplus V \oplus (V \wedge V) \oplus \cdots \oplus \textstyle\bigwedge^r V \oplus \cdots \qquad (13.20)$$

The first two summands in Eq. (13.18) are of course zero. Comparing degree
components in the last two equations yields

$$\textstyle\bigwedge^n(V) \cong T^n(V)/(T^n(V) \cap J). \qquad (13.21)$$

Now, since $f$ is multilinear, it factors through $\bar{f} : T^n(V) \to W$ as in part 1. But
since $f$ is alternating, we see that $\ker \bar{f}$ contains $T^n(V) \cap J$. Thus it follows that
$\bar{f}$ factors through this chain:

$$T^n(V) \to T^n(V)/(T^n(V) \cap J) \to \bigl[T^n(V)/(T^n(V) \cap J)\bigr] \big/ \bigl[\ker \bar{f}/(T^n(V) \cap J)\bigr]$$

$$\cong T^n(V)/\ker \bar{f} \cong \bar{f}(T^n(V)) \subseteq W.$$

(The first two morphisms are natural projections, the next two isomorphisms are
from the classical homomorphism theorems, and the last mapping is the inclusion
relation into $W$.) Folding in the isomorphism of Eq. (13.21) at the second term and
composing with the remaining mappings we obtain the desired morphism

$$f_E : \textstyle\bigwedge^n(V) \to W$$

through which $\bar{f}$ (and hence $f$) factors. That

$$f(v_1, \ldots, v_n) = \bar{f}(v_1 \otimes \cdots \otimes v_n) = f_E(v_1 \wedge \cdots \wedge v_n)$$

follows upon applying the mappings.

Part 3 follows the same pattern except that we use the ideal $I$, given at the beginning
of Sect. 13.10, in place of $J$. Again one exploits the fact that $I$ is a homogeneous
ideal to deduce that $S^n(V) \cong T^n(V)/(T^n(V) \cap I)$. We leave the student to fill in
the details in Exercise (1) in Sect. 13.13.8.
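
Part 2 of the theorem can be illustrated in the smallest interesting case, an alternating bilinear form on a three-dimensional space. In the Python sketch below, the particular form `f`, the coordinate encoding of $u \wedge v$, and all function names are hypothetical choices made for the illustration; the induced map is determined by the values of `f` on pairs of basis vectors $e_i, e_j$ with $i < j$.

```python
from itertools import combinations


def f(u, v):
    """A sample alternating bilinear form on a 3-dimensional space (made up)."""
    return 2 * (u[0] * v[1] - u[1] * v[0]) - 5 * (u[1] * v[2] - u[2] * v[1])


def wedge_coords(u, v):
    """Coordinates of u ^ v in the basis e_i ^ e_j (i < j) of the second exterior power."""
    n = len(u)
    return {(i, j): u[i] * v[j] - u[j] * v[i] for i, j in combinations(range(n), 2)}


def basis_vector(i, n):
    return [1 if k == i else 0 for k in range(n)]


def f_E(coords, n=3):
    """The linear map on the second exterior power with f_E(e_i ^ e_j) = f(e_i, e_j)."""
    return sum(c * f(basis_vector(i, n), basis_vector(j, n))
               for (i, j), c in coords.items())


u, v = [1, 2, 3], [4, 5, 6]
print(f(u, v) == f_E(wedge_coords(u, v)))  # f factors through the wedge product
```

The design point is that `f_E` is an ordinary linear map on a space of dimension three, while `f` is a bilinear map on pairs; the factorization trades multilinearity for linearity.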


13.12 Last Words 


From here the subject of multilinear algebra begins to blend with another subject 
called Geometric Algebra. Although space does not allow us to pursue the latter 
subject, a hint of its flavor may be suggested by the following. 

Suppose $V$ is both a left vector space and a right vector space over a division
ring $D$. Do not assume that $V$ is a $(D, D)$-bimodule. (This would virtually make
$D$ a field.) A 2-form is a bi-additive mapping $f : V \times V \to D$ such that

$$f(\alpha u, v \beta) = \alpha f(u, v) \beta \quad\text{for all } \alpha, \beta \in D,\ u, v \in V.$$

If $g = \alpha f$ for a fixed scalar $\alpha$ in $D$, then $g$ is said to be proportional to $f$. Such a
2-form $f$ is said to be reflexive if and only if $f(u, v) = 0$ implies $f(v, u) = 0$ for all
$u, v \in V$, and is non-degenerate if and only if $f(v, V) = 0$ implies $v = 0$.


Theorem 13.12.1 Suppose $\dim V > 2$ and $f : V \times V \to F$ is a reflexive
non-degenerate 2-form, where $F$ is a field. Then $f$ is proportional to a 2-form $g$ such that
for all $u, v \in V$ exactly one of the following holds:


(i) $g(u, v) = g(v, u)$ (symmetric form),
(ii) $g(u, u) = 0$ (alternating or symplectic form),
(iii) $g(u, v) = g(v, u)^\sigma$, where $\sigma$ is a field automorphism of order 2 (Hermitian
form).


The first case can be derived from quadratic forms (see Exercise (9) in
Sect. 13.13.7). All three cases are the children of a description of 2-forms over
division rings, the so-called $(\sigma, \epsilon)$-Hermitian forms (see [2], pp. 199-200 for versions
of J. Tits’ proof that reflexive forms must take this shape).

There are many beautiful results intertwining 2-forms and multilinear forms as 
well as widespread geometric applications of such forms. The interested student is 
referred to the classic book Geometric Algebra by E. Artin [1], which, after more 
than fifty years, now begs for an update. 


13.13 Exercises 


13.13.1 Exercises for Sect. 13.2 


1. Let $\phi : M \to N$ be a homomorphism of right $R$-modules. Recall that the cokernel
of $\phi$ was defined by a universal mapping property on p. 477, in any category for
which an initial object exists and is also a terminal object. (i) In the category
of right $R$-modules, show that the projection $\pi : N \to N/\operatorname{im} \phi$ satisfies this
mapping property. (ii) In the category of groups and their homomorphisms, do
cokernels of homomorphisms exist? If so, define them. [Hint: Examine the universal
mapping property of a cokernel in this category.]

2. Given the morphism $\phi : M \to N$ in the category of right $R$-modules, define a
category within which the cokernel of $\phi$ appears as an initial object.

3. Let $\{M_\alpha \mid \alpha \in A\}$ be a family of right $R$-modules. Recall from p. 260 that if
$P := \prod_{\alpha \in A} M_\alpha$, then $P$ satisfies the following universal mapping property: First of
all, there are projection homomorphisms $\pi_\alpha : P \to M_\alpha$, $\alpha \in A$; if $P'$ is another
right $R$-module with homomorphisms $\pi'_\alpha : P' \to M_\alpha$, $\alpha \in A$, then there exists
a unique $R$-module homomorphism $\theta : P' \to P$ making the triangle formed by
$\theta$, $\pi_\alpha$ and $\pi'_\alpha$ commute for every $\alpha \in A$ (that is, $\pi_\alpha \circ \theta = \pi'_\alpha$).
Now define a category within which the direct product $P$ appears as a terminal
object.

4. Let $X$ be a fixed set and define the following category. The objects are the pairs
$(G, \mu)$, where $G$ is a group and $\mu : X \to G$ is a mapping of the set $X$ into
$G$. A morphism from the pair $(G, \mu)$ to the pair $(G', \mu')$ is a group homomorphism
$\phi : G \to G'$ such that $\mu' = \phi \circ \mu$. Show how the free group $F(X)$ on the set $X$
affords an initial object in this category.

5. Rephrase and repeat the above exercise with “groups” replaced with “right
$R$-modules”. [Hint: The free module with basis $X$ replaces the free group $F(X)$.]

6. Suppose $F : C_1 \to C_2$ is a contravariant functor. Show that there are also covariant
functors $F' : C_1^{op} \to C_2$ and $F'' : C_1 \to C_2^{op}$.

7. Show that there exists an isomorphism $C \to (C^{op})^{op}$ which induces the identity
mapping on objects. (It may move maps, however.)

13.13.2 Exercises for Sect. 13.3 


1. Let $A$ be an abelian group. If $n$ is a positive integer, prove that

$$\mathbb{Z}/n\mathbb{Z} \otimes_{\mathbb{Z}} A \cong A/nA.$$

2. Let $m, n$ be positive integers and let $d = \gcd(m, n)$. Prove that

$$\mathbb{Z}/m\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/d\mathbb{Z}.$$

3. Let

$$0 \to M' \xrightarrow{\;\mu\;} M \xrightarrow{\;\epsilon\;} M'' \to 0$$

be a split exact sequence of right $R$-modules, and let $N$ be a left $R$-module. Show
that the induced sequence

$$0 \to M' \otimes_R N \xrightarrow{\;\mu \otimes 1_N\;} M \otimes_R N \xrightarrow{\;\epsilon \otimes 1_N\;} M'' \otimes_R N \to 0$$

is also split exact. Prove the corresponding statement if the split exact sequence
occurs as the right hand factor of the tensor product sequence.

4. Prove that if $P$ is a projective left $R$-module, then $P$ is flat. [Hint: Let $\mu : M' \to M$
be an injective homomorphism of right $R$-modules. We know that $P$ is the direct
summand of a free left $R$-module $F$, and so there must be a split exact sequence
of the form

$$0 \to P \xrightarrow{\;i\;} F \to N \to 0,$$

for some left $R$-module $N$. By Exercise 3 we know that both $1_{M'} \otimes i : M' \otimes_R P \to
M' \otimes_R F$ and $1_M \otimes i : M \otimes_R P \to M \otimes_R F$ are injective. We have the commutative
square

$$\begin{array}{ccc}
M' \otimes_R P & \xrightarrow{\;\mu \otimes 1_P\;} & M \otimes_R P \\
{\scriptstyle 1_{M'} \otimes i}\downarrow & & \downarrow{\scriptstyle 1_M \otimes i} \\
M' \otimes_R F & \xrightarrow{\;\mu \otimes 1_F\;} & M \otimes_R F
\end{array}$$

Now what?]

5. Generalize the proof of Theorem 13.3.8 so as to prove the following. Let $R$ be a
ring, let $M, M'$ be right $R$-modules, and let $N, N'$ be left $R$-modules. Let $\phi :
M \to M'$ and let $\psi : N \to N'$ be module homomorphisms. Define $K \subseteq M \otimes_R N$
to be the subgroup generated by simple tensors $m \otimes n$ where either $m \in \ker \phi$ or
$n \in \ker \psi$. Show that, in fact, $K = \ker(\phi \otimes \psi : M \otimes_R N \to M' \otimes_R N')$.

13.13.3 Exercises for Sect. 13.3.4 


1. Let $C$ be a category and let $\mu : A \to B$ be a morphism. We say that $\mu$ is a
monomorphism if whenever $A'$ is an object with morphisms $f : A' \to A$, $g :
A' \to A$ such that $\mu \circ f = \mu \circ g : A' \to B$, then $f = g : A' \to A$. In other
words, monomorphisms are the morphisms that can be cancelled on the left (loosely,
they behave as if they had “left inverses”). Similarly, epimorphisms are the
morphisms that can be cancelled on the right. Now assume that
$C$, $D$ are categories, and that $F : C \to D$, $G : D \to C$ are functors, with $F$
left adjoint to $G$. Prove that $F$ preserves epimorphisms and that $G$ preserves
monomorphisms.

2. Let $i : \mathbb{Z} \hookrightarrow \mathbb{Q}$ be the inclusion homomorphism. Prove that in the category of
rings, $i$ is an epimorphism. Thus an epimorphism need not be surjective.

3. Let $V, W$ be $F$-vector spaces and let $V^*$ be the $F$-dual of $V$. Prove that there is a
vector space isomorphism $V^* \otimes_F W \cong \operatorname{Hom}_F(V, W)$.

4. Let $G$ be a group. Exactly as in Sect. 7.3.2, we may define the integral group ring
$\mathbb{Z}G$. (These are formal $\mathbb{Z}$-linear combinations of group elements in $G$.) In this
way a group gives rise to a ring. Correspondingly, given a ring $R$ we may form its
group of units $U(R)$. This time, a ring is giving rise to a group. Show that these
correspondences define functors

$$\mathbb{Z} : \text{Groups} \to \text{Rings}, \qquad U : \text{Rings} \to \text{Groups}.$$

Prove that $\mathbb{Z}$ is left adjoint to $U$.

5. Below are some further examples of adjoint functors. In each case you are to
prove that $F$ is left adjoint to $G$.

(a) Groups $\xrightarrow{F}$ Abelian Groups $\xrightarrow{G}$ Groups;
$F$ is the commutator quotient map.

(b) Sets $\xrightarrow{F}$ Groups $\xrightarrow{G}$ Sets,
where $F(X)$ = free group on $X$ and $G(H)$ is the underlying set of the group
$H$.

(c) Integral Domains $\xrightarrow{F}$ Fields $\xrightarrow{G}$ Integral Domains;
$F(D)$ is the field of fractions of $D$. (Note: for this example we consider
the morphisms of the category Integral Domains to be restricted only to
injective homomorphisms.)

(d) Fix a field $K$.
$K$-Vector Spaces $\xrightarrow{F}$ $K$-Algebras $\xrightarrow{G}$ $K$-Vector Spaces;
$F(V) = T(V)$, the tensor algebra of $V$, and $G(A)$ is simply the underlying
vector space structure of the algebra $A$.

(e) Abelian Groups $\xrightarrow{F}$ Torsion Free Abelian Groups $\xrightarrow{G}$ Abelian Groups;
$F(A) = A/T(A)$, where $T(A)$ is the torsion subgroup of $A$.

(f) Left $R$-modules $\xrightarrow{F}$ Abelian Groups $\xrightarrow{G}$ Left $R$-modules.
$F$ is the forgetful functor, $G = \operatorname{Hom}_{\mathbb{Z}}(R, -)$.

(g) If $G$ is a group, denote by $G$ Set the category of all sets $X$ acted on by $G$.
If $X_1, X_2$ are objects in $G$ Set then the morphisms from $X_1$ to $X_2$ are the
mappings in $\operatorname{Hom}_G(X_1, X_2)$. Let $H$ be a subgroup of $G$. Then an adjoint
functor pair is given by

$G$ Set $\xrightarrow{F}$ $H$ Set $\xrightarrow{G}$ $G$ Set

where, for a $G$-set $X$ and an $H$-set $Y$, $F(X) = \operatorname{Res}_H^G(X)$, $G(Y) = \operatorname{Ind}_H^G(Y)$.

[The induced action of $G$ on $Y \times G/H$, denoted by the symbol $\operatorname{Ind}_H^G(Y)$, is
defined in Exercise 7 of the following Sect. 13.13.4 on p. 519.]


13.13.4 Exercises for Sect. 13.4 


1. Give a formal proof of Corollary 13.4.3, which asserts that if $F \subseteq E$ is an
extension of fields, and $V$ is an $F$-vector space with basis $\{x_i\}$, then as an $E$-vector
space, $V \otimes_F E$ has basis $\{x_i \otimes 1_E\}$. [Hint: Write $V = \bigoplus x_i F$ and use
Corollary 13.3.6.]

2. Let $T : V \to V$ be a linear transformation of the finite-dimensional $F$-vector
space $V$. Let $m_{T,F}(x)$ denote the minimal polynomial of $T$ with respect to the
field $F$. If $F \subseteq E$ is a field extension, prove that $m_{T,F}(x) = m_{T \otimes 1_E, E}(x)$. [Hint:
Apply the previous Exercise 1, above.]

3. Let $W$ be an $F$-vector space and let $T : V_1 \to V_2$ be an injective linear
transformation of $F$-vector spaces. Prove that the mapping $T \otimes 1_W : V_1 \otimes_F W \to V_2 \otimes_F W$ is
injective. (Note that by Theorem 13.3.8, we’re really just saying that every object
in the category of $F$-vector spaces is flat.)

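4. Let $F$ be a field and let $A \in M_n(F)$, $B \in M_m(F)$ be square matrices. Define the
Kronecker (or tensor) product $A \otimes B$ as follows. If $A = [a_{ij}]$, $B = [b_{kl}]$, then
$A \otimes B$ is the block matrix $D = [D_{pq}]$, where each entry $D_{pq}$ (in row $p$ and column $q$
of $D$) is the $m \times m$ matrix $D_{pq} = a_{pq} B$. Thus, for instance, if

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad
B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix},$$

then

$$A \otimes B = \begin{bmatrix}
a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\
a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\
a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\
a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22}
\end{bmatrix}.$$

Now let $V, W$ be $F$-vector spaces with ordered bases $\mathcal{A} = (v_1, v_2, \ldots, v_n)$,
$\mathcal{B} = (w_1, w_2, \ldots, w_m)$, respectively. Let $T : V \to V$, $S : W \to W$ be linear
transformations with matrix representations $T_{\mathcal{A}} = A$, $S_{\mathcal{B}} = B$. Assume that
$\mathcal{A} \otimes \mathcal{B}$ is the ordered basis of $V \otimes_F W$ given by $\mathcal{A} \otimes \mathcal{B} = (v_1 \otimes w_1, v_1 \otimes
w_2, \ldots, v_1 \otimes w_m;\ v_2 \otimes w_1, \ldots, v_2 \otimes w_m;\ \ldots, v_n \otimes w_m)$. Show that the matrix
representation of $T \otimes S$ relative to $\mathcal{A} \otimes \mathcal{B}$ is given by $(T \otimes S)_{\mathcal{A} \otimes \mathcal{B}} = T_{\mathcal{A}} \otimes S_{\mathcal{B}}$
(the Kronecker product of matrices given earlier in this exercise).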
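
For readers who wish to experiment before attempting the exercise, the block description of $A \otimes B$ translates directly into code. The following Python sketch uses plain lists of lists; the names `kron` and `matmul` are illustrative, and the final check exhibits the multiplicativity $(A \otimes B)(C \otimes D) = AC \otimes BD$ on one numerical example rather than proving anything.

```python
def kron(a, b):
    """Kronecker product: the block in block-row p and block-column q is a[p][q] * b."""
    return [[a[p][q] * b[i][j]
             for q in range(len(a[0])) for j in range(len(b[0]))]
            for p in range(len(a)) for i in range(len(b))]


def matmul(a, b):
    """Ordinary matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

for row in kron(A, B):
    print(row)
print(matmul(kron(A, B), kron(C, D)) == kron(matmul(A, C), matmul(B, D)))
```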

5. Let $V$ be a two-dimensional vector space over the field $F$, and let $T, S : V \to V$
be linear transformations. Assume the minimal polynomials of $S$ and $T$ are given
by: $m_T(x) = (x - a)^2$, $m_S(x) = (x - b)^2$. (Therefore $T$ and $S$ can be represented
by $2 \times 2$ Jordan blocks, $J_2(a)$, $J_2(b)$, respectively.) Compute the invariant factors
of $T \otimes S : V \otimes V \to V \otimes V$. (See Sect. 10.7.2, p. 349, for notation and definitions
concerning minimal polynomials and Jordan blocks.)

6. Let $M$ be a right $R$-module, and let $I \subseteq R$ be a 2-sided ideal in $R$. Prove that, as
right $R$-modules,

$$M \otimes_R (R/I) \cong M/MI.$$


7. (Induced representations) Here is an important application of the tensor
product. Let $G$ be a finite group, let $F$ be a field, and let $FG$ be the $F$-group ring
(see Sect. 7.3). Note that $FG$ is clearly an $F$-vector space via scalar multiplication

$$\alpha \cdot \sum_{g \in G} \alpha_g g = \sum_{g \in G} (\alpha \alpha_g) g, \quad \alpha \in F.$$

Likewise, if $M$ is any right $FG$-module, then $M$ naturally carries the structure of
an $F$-vector space by setting $\alpha \cdot m := m(\alpha e)$, $\alpha \in F$, $m \in M$, where $e$ is the
identity of the group $G$. Now let $H$ be a subgroup of $G$ and regard $FH$ as a subring
of $FG$ in the obvious way. Let $V$ be a right $FH$-module, finite dimensional over
$F$, and form the induced module

$$\operatorname{Ind}_H^G(V) := V \otimes_{FH} FG.$$

Since $FG$ is an $(FH, FG)$-bimodule, we infer that, in fact, $\operatorname{Ind}_H^G(V)$ is actually a
right $FG$-module. Now show that

$$\dim_F \operatorname{Ind}_H^G(V) = [G : H] \cdot \dim_F V.$$
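8. Let $A$ be an abelian group. Prove that a ring structure on $A$ is equivalent to an
abelian group homomorphism $\mu : A \otimes_{\mathbb{Z}} A \to A$, together with an element $e \in A$
such that $\mu(e \otimes a) = \mu(a \otimes e) = a$, for all $a \in A$, and such that

$$\begin{array}{ccc}
A \otimes_{\mathbb{Z}} A \otimes_{\mathbb{Z}} A & \xrightarrow{\;1 \otimes \mu\;} & A \otimes_{\mathbb{Z}} A \\
{\scriptstyle \mu \otimes 1}\downarrow & & \downarrow{\scriptstyle \mu} \\
A \otimes_{\mathbb{Z}} A & \xrightarrow{\;\;\mu\;\;} & A
\end{array}$$

commutes. (The above diagram, of course, stipulates that multiplication is
associative.)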

9. The following theorem is an application of the tensor product to ideal classes in 
the field of fractions of a Dedekind Domain. 


Theorem 13.13.1 Let I and J be fractional ideals in the field of fractions E of a 
Dedekind domain R. If [I] = [J] (that is, I and J belong to the same ideal class in 
E) then I and J are isomorphic as right R-modules. 


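Provide a proof of this theorem. (The converse is easy: see Exercise (5) in
Sect. 9.13.5.) [Hint: First, since $I$ has the form $\alpha I'$ for some $\alpha \in E$ and ideal
$I'$ in $R$, we have $[I] = [I']$. Thus, without loss of generality, we may assume that
$I$ is an ideal of $R$. Note that since $R$ is an integral domain, the multiplicative identity
of $R$ is the multiplicative identity $1_E$ of $E$. Define the injections $i_I : I \to I \otimes_R E$
and $i_J : J \to J \otimes_R E$ by

$$i_I(r) = r \otimes 1_E, \quad\text{and}$$
$$i_J(s) = s \otimes 1_E$$

for all $(r, s) \in I \times J$.
Now consider the commutative diagram below:

$$\begin{array}{ccccc}
I \otimes_R E & \xrightarrow{\;\phi \otimes 1_E\;} & J \otimes_R E & \xrightarrow{\;\epsilon\;} & E \\
{\scriptstyle i_I}\uparrow & & \uparrow{\scriptstyle i_J} & \nearrow{\scriptstyle i} & \\
I & \xrightarrow{\;\;\phi\;\;} & J & &
\end{array}$$

where $i_I$, $i_J$ are the injections given above, $\epsilon : J \otimes_R E \to E$ is given by
$\epsilon(b \otimes \lambda) = b\lambda$, and where $i : J \to E$ is the containment mapping. Note also that
$\phi \otimes 1_E : I \otimes_R E \to J \otimes_R E$ is an $E$-linear transformation.

Next, note that if $0 \ne a_0 \in I$, then $(a_0 \otimes a_0^{-1})a = (a_0 \otimes 1)a_0^{-1}a = (a_0 \otimes
a)a_0^{-1} = (a a_0 \otimes 1)a_0^{-1} = (a \otimes a_0)a_0^{-1} = (a \otimes 1)a_0^{-1}a_0 = a \otimes 1$. Therefore,
set $\alpha_0 := \epsilon(\phi \otimes 1)(a_0 \otimes a_0^{-1}) \in E$ and obtain $\phi(a) = \epsilon(\phi \otimes 1)(a \otimes 1) =
\epsilon(\phi \otimes 1)((a_0 \otimes a_0^{-1})a) = \epsilon(\phi \otimes 1)(a_0 \otimes a_0^{-1}) \cdot a = \alpha_0 a \in J$. Since $\phi : I \to J$ is
an isomorphism, the result follows.]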


13.13.5 Exercises for Sects. 13.6 and 13.7 


1. Show that if $R$ is a commutative ring, and if $A$ is an $R$-module, then a
multiplication $\mu : A \otimes_R A \to A$ gives $A$ the structure of an $R$-algebra if and only if the
diagrams in Figs. 13.1 and 13.2 (see p. 495) are commutative.

2. Let $A_1, A_2$ be commutative $R$-algebras. Prove that $A_1 \otimes_R A_2$ satisfies a
universal condition reminiscent of that for direct sums of $R$-modules. Namely, there
exist $R$-algebra homomorphisms $\mu_i : A_i \to A_1 \otimes_R A_2$, $i = 1, 2$, satisfying the
following. If $B$ is any commutative $R$-algebra such that there exist $R$-algebra
homomorphisms $\phi_i : A_i \to B$, then there exists a unique $R$-algebra
homomorphism $\theta : A_1 \otimes_R A_2 \to B$ such that for $i = 1, 2$, the triangle formed by $\mu_i$, $\theta$ and
$\phi_i$ commutes (that is, $\theta \circ \mu_i = \phi_i$).

3. Prove that $R[x] \otimes_R R[y] \cong R[x, y]$ as $R$-algebras.

4. Let $F$ be a field and form $F$-group algebras for the finite groups $G_1, G_2$ as in
Sect. 7.3. Prove that $F[G_1 \times G_2] \cong FG_1 \otimes_F FG_2$ as $F$-algebras.


13.13.6 Exercises for Sect. 13.8 


1. Let $A$ be an $R$-algebra graded over the nonnegative integers. We say that $A$ is
graded-commutative if whenever $a_r \in A_r$, $a_s \in A_s$ we have $a_r a_s = (-1)^{rs} a_s a_r$.
Now let $A = \bigoplus_{r=0}^{\infty} A_r$, $B = \bigoplus_{s=0}^{\infty} B_s$ be graded-commutative $R$-algebras. Prove that
there is a graded-commutative algebra structure on $A \otimes_R B$ satisfying

$$(a_r \otimes b_s) \cdot (a_p \otimes b_q) = (-1)^{sp}(a_r a_p \otimes b_s b_q),$$

$a_r \in A_r$, $a_p \in A_p$, $b_s \in B_s$, $b_q \in B_q$. (This is usually the intended meaning of
“tensor product” in the category of graded-commutative $R$-algebras.)


13.13.7 Exercises for Sect. 13.10 


ww 


. Assume that the F-vector space V has dimension n. For each r > 0, compute 


the F-dimension of S’(V). 


2. Prove Theorem 13.10.1.
3. Prove Corollary 13.10.2.
4. (Determinants) Let $T : V \to V$ be a linear transformation, and assume that
$\dim V = n$. Show that there exists a scalar $\det(T)$, such that

$$E^n(T) = \det T \cdot \mathrm{id}_{E^n(V)} : E^n(V) \to E^n(V).$$

Cite the results in Sect. 13.9 that show, for $S, T \in \mathrm{Hom}(V, V)$, that

$$\det(S) \cdot \det(T) = \det(S \circ T) = \det(T \circ S).$$
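The scalar in this exercise can be computed concretely. The following sketch assumes V has a chosen basis $e_1, \ldots, e_n$ and that T is given by an integer matrix whose j-th column holds the coordinates of $T(e_j)$. It expands $T(e_1) \wedge \cdots \wedge T(e_n)$ multilinearly and reads off the coefficient of $e_1 \wedge \cdots \wedge e_n$. The function names are hypothetical.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def top_exterior_scalar(T):
    """Coefficient of e_1 ^ ... ^ e_n in T(e_1) ^ ... ^ T(e_n).

    T is a square matrix (list of rows); column j holds T(e_j).
    Terms with a repeated basis vector vanish, so only permutations
    contribute, each with its sign.
    """
    n = len(T)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= T[perm[j]][j]
        total += term
    return total


def compose(S, T):
    """Matrix of the composite S o T."""
    n = len(S)
    return [
        [sum(S[i][k] * T[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    S = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
    T = [[1, 1, 0], [0, 1, 2], [1, 0, 1]]
    # Multiplicativity, checked on one example only.
    assert top_exterior_scalar(compose(S, T)) == (
        top_exterior_scalar(S) * top_exterior_scalar(T)
    )
```

A check on examples is of course no substitute for the functoriality argument the exercise asks for.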


5. Let $G \to \mathrm{GL}_F(V)$ be a group representation on the F-vector space V. Show
that the mapping $G \to \mathrm{GL}_F(E^r(V))$ given by $g \mapsto E^r(g)$ defines a group
representation on $E^r(V)$, $r \ge 0$.


6. Let V be a vector space and let $v \in V$. Define the linear map
$\cdot \wedge v : E^r(V) \to E^{r+1}(V)$ by $\omega \mapsto \omega \wedge v$. If $\dim V = n$,
compute the dimension of the kernel of $\cdot \wedge v$.


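7. (Boundary operators) Let V be n-dimensional over the field F, and having basis
$\{v_1, v_2, \ldots, v_n\}$. For each integer r, $1 \le r \le n$, define the linear
transformation $\partial_r : E^r(V) \to E^{r-1}(V)$, from its value on basis elements by
setting

$$\partial_r(v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_r}) =
\sum_{j=1}^{r} (-1)^{j-1}\, v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge \widehat{v_{i_j}} \wedge \cdots \wedge v_{i_r},$$

where $\widehat{(\cdot)}$ means delete the factor $(\cdot)$. The mapping
$\partial_1 : E^1(V) = V \to E^0(V) = F$ is the mapping determined by $\partial_1(v_i) = 1$,
$i = 1, 2, \ldots, n$.

(a) Show that if $r > 1$, then $\partial_{r-1}\partial_r = 0 : E^r(V) \to E^{r-2}(V)$.

(b) Define a mapping $h_r : E^r(V) \to E^{r+1}(V)$, $0 \le r \le n-1$, by setting
$h_r(\eta) = v_1 \wedge \eta$, where $\eta \in E^r(V)$.
Show that if $1 \le r \le n-1$, then $\partial_{r+1}h_r + h_{r-1}\partial_r = \mathrm{id}_{E^r(V)}$,
and that $h_{n-1}\partial_n = \mathrm{id}_{E^n(V)}$, $\partial_1 h_0 = \mathrm{id}_{E^0(V)}$.

(c) Conclude that the sequence

$$0 \to E^n(V) \xrightarrow{\partial_n} E^{n-1}(V) \xrightarrow{\partial_{n-1}} \cdots
\xrightarrow{\partial_2} E^1(V) \xrightarrow{\partial_1} E^0(V) \to 0$$

is exact.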
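The identities in parts (a) and (b) can be tested mechanically on basis elements before attempting a proof. The sketch below is only an illustration under stated assumptions: an element of $E^r(V)$ is stored as a dictionary from increasing index tuples $(i_1, \ldots, i_r)$ to integer coefficients, the sign used is $(-1)^{j-1}$ for the j-th deleted factor, and the names `boundary`, `wedge_v1`, `add` and `check` are hypothetical.

```python
from itertools import combinations


def boundary(omega):
    """Apply the boundary operator to a linear combination of basis wedges."""
    result = {}
    for idx, coeff in omega.items():
        for j in range(len(idx)):
            face = idx[:j] + idx[j + 1:]          # delete the factor in position j
            result[face] = result.get(face, 0) + (-1) ** j * coeff
    return {k: v for k, v in result.items() if v != 0}


def wedge_v1(eta):
    """The map h: eta -> v_1 ^ eta on linear combinations of basis wedges."""
    result = {}
    for idx, coeff in eta.items():
        if 1 in idx:
            continue                               # v_1 ^ v_1 = 0
        key = (1,) + idx                           # still increasing, since 1 is minimal
        result[key] = result.get(key, 0) + coeff
    return result


def add(a, b):
    """Sum of two elements, dropping zero coefficients."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}


def check(n):
    """Verify both identities on every basis element of E^r(V), 0 <= r <= n."""
    for r in range(n + 1):
        for idx in combinations(range(1, n + 1), r):
            omega = {idx: 1}
            assert boundary(boundary(omega)) == {}
            assert add(boundary(wedge_v1(omega)), wedge_v1(boundary(omega))) == omega
    return True


if __name__ == "__main__":
    print(check(4))
```

In this encoding the empty tuple stands for $1 \in E^0(V) = F$, and `boundary` sends it to zero, so the boundary identities at $r = 0$ and $r = n$ appear as special cases of the same check.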


8. Let V be an F-vector space. An F-linear mapping $\delta : E(V) \to E(V)$ is called
an antiderivation if for all $\omega \in E^r(V)$, $\eta \in E^s(V)$ we have

$$\delta(\omega \wedge \eta) = \delta(\omega) \wedge \eta + (-1)^r \omega \wedge \delta(\eta).$$

Now let $f : V \to F$ be a linear functional, and show that f can be extended
uniquely to an antiderivation $\delta : E(V) \to E(V)$ satisfying $\delta(v) = f(v) \cdot 1_{E(V)}$,
for all $v \in V$. In addition, show that $\delta$ satisfies

(a) $\delta : E^r(V) \to E^{r-1}(V)$, $r \ge 1$, $\delta : E^0(V) \to \{0\}$.
(b) $\delta^2 = 0 : E(V) \to E(V)$.
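As a low-degree sanity check of what the exercise claims (an illustration, not a proof), take $v, w \in V$, so that $r = 1$ in the antiderivation rule. Assuming an antiderivation $\delta$ with $\delta(v) = f(v) \cdot 1$ exists, and using that an F-linear antiderivation kills $1$ (apply the rule to $1 \wedge 1$), one finds:

```latex
\begin{align*}
\delta(v \wedge w) &= \delta(v) \wedge w + (-1)^{1}\, v \wedge \delta(w)
                    = f(v)\, w - f(w)\, v \in E^{1}(V), \\
\delta^{2}(v \wedge w) &= f(v)\,\delta(w) - f(w)\,\delta(v)
                    = f(v) f(w) - f(w) f(v) = 0, \\
\delta^{2}(v) &= \delta\bigl(f(v) \cdot 1\bigr) = f(v)\,\delta(1) = 0 .
\end{align*}
```

The first line illustrates part (a) for $r = 2$, and the remaining two lines illustrate part (b) in degrees 2 and 1.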


9. (The Clifford Algebra) Let V be an F-vector space and let $Q : V \to F$ be a
function. We call Q a quadratic form if Q satisfies

(a) $Q(av) = a^2 Q(v)$, for all $a \in F$, $v \in V$, and
(b) the mapping $B : V \times V \to F$ given by

$$(v, w) \mapsto Q(v + w) - Q(v) - Q(w)$$

defines a (clearly symmetric) bilinear form on V.

Given the quadratic form Q on V, we define a new algebra, C(Q), the Clifford
algebra, as follows. Inside the tensor algebra T(V) define the ideal I to be
generated by elements of the form $v \otimes v - Q(v) \cdot 1_F$, $v \in V$. (Note that I is
not a homogeneous ideal of T(V).) The Clifford algebra is the quotient algebra
$C(Q) = T(V)/I$.

Show that C(Q) satisfies the following universal criterion. Assume that $C'$ is
an F-algebra and that there exists a linear mapping $f : V \to C'$ such that for
all $v \in V$, $(f(v))^2 = Q(v) \cdot 1_{C'}$. Prove that there exists a unique F-algebra
homomorphism $\eta : C(Q) \to C'$ making the following triangle commute:

[Diagram: triangle with $\iota : V \to C(Q)$, $f : V \to C'$, and $\eta : C(Q) \to C'$]

(In the above diagram, $\iota : V \to C(Q)$ is just the natural homomorphism
$v \mapsto v + I$.)
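A relation that is often useful when working with C(Q) follows directly from the definitions above. The derivation below is a sketch that writes $v$ for the image of $v$ in $C(Q) = T(V)/I$ and uses only the defining relation $v^2 = Q(v) \cdot 1$ together with the definition of $B$ in part (b):

```latex
\begin{align*}
(v + w)^2 &= Q(v + w) \cdot 1, \\
v^2 + vw + wv + w^2 &= \bigl(Q(v) + Q(w) + B(v, w)\bigr) \cdot 1, \\
vw + wv &= B(v, w) \cdot 1 \qquad \text{for all } v, w \in V .
\end{align*}
```

In particular, vectors with $B(v, w) = 0$ anticommute in C(Q), and when $Q = 0$ the defining relation becomes $v^2 = 0$, which is the relation satisfied by vectors in the exterior algebra.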

10. Let V be an n-dimensional F-vector space. If $d \le n$, define the
(n, d)-Grassmann space, $G_d(V)$, as the set of all d-dimensional subspaces of V. In
particular, if $d = 1$, the set $G_1(V)$ is more frequently called the projective space
on V, and is denoted by $\mathbb{P}(V)$.⁹ We define a mapping

$$\phi : G_d(V) \to \mathbb{P}(E^d(V)),$$

as follows. If $U \in G_d(V)$, let $\{u_1, \ldots, u_d\}$ be a basis of U, and let $\phi(U)$ be
the 1-space in $\mathbb{P}(E^d(V))$ spanned by $u_1 \wedge \cdots \wedge u_d$. Prove that
$\phi : G_d(V) \to \mathbb{P}(E^d(V))$ is a well-defined injection of $G_d(V)$ into
$\mathbb{P}(E^d(V))$. (This mapping is called the Plücker embedding.)

11. Let V be an n-dimensional vector space over the field F. 


(a) Show that if $1 < d < n - 1$, then the Plücker embedding
$\phi : G_d(V) \to \mathbb{P}(E^d(V))$ is never surjective. [Hint: Why is it sufficient to
consider only $(d, n) = (2, 4)$?]

(b) If F is a finite field $\mathbb{F}_q$, show that the Plücker embedding
$\phi : G_{n-1}(V) \to \mathbb{P}(E^{n-1}(V))$ is surjective. This implies that every element
$z \in E^{n-1}(V)$ can be written as a “decomposable element” of the form
$z = v_1 \wedge v_2 \wedge \cdots \wedge v_{n-1}$ for suitable vectors
$v_1, v_2, \ldots, v_{n-1} \in V$. [An obvious counting argument.]
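For the case $(d, n) = (2, 4)$ mentioned in the hint, one can experiment with coordinates. The sketch below assumes a fixed basis $e_0, \ldots, e_3$ of V and records an element of $E^2(V)$ by its coefficients $p_{ij}$ on $e_i \wedge e_j$, $i < j$. It relies on the classical fact (not stated in the text above) that $z \wedge z = 2(p_{01}p_{23} - p_{02}p_{13} + p_{03}p_{12})\, e_0 \wedge e_1 \wedge e_2 \wedge e_3$ and that this quadratic expression vanishes identically on decomposable elements $u \wedge v$. The function names are hypothetical.

```python
from itertools import combinations


def plucker_coordinates(u, v):
    """Coefficients of u ^ v on e_i ^ e_j (i < j): the 2x2 minors of [u; v]."""
    n = len(u)
    return {
        (i, j): u[i] * v[j] - u[j] * v[i]
        for i, j in combinations(range(n), 2)
    }


def plucker_relation(p):
    """The quadratic form p01*p23 - p02*p13 + p03*p12 on E^2 of a 4-space."""
    return (
        p[(0, 1)] * p[(2, 3)]
        - p[(0, 2)] * p[(1, 3)]
        + p[(0, 3)] * p[(1, 2)]
    )


if __name__ == "__main__":
    # A decomposable element: the relation vanishes.
    p = plucker_coordinates((1, 2, 3, 4), (0, 1, 5, 7))
    assert plucker_relation(p) == 0

    # z = e0 ^ e1 + e2 ^ e3: the relation does not vanish.
    z = {pair: 0 for pair in combinations(range(4), 2)}
    z[(0, 1)] = 1
    z[(2, 3)] = 1
    assert plucker_relation(z) == 1
```

Under the stated assumption, the second example produces an element of $E^2(V)$ whose span is not in the image of $\phi$, for any field F.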


12. Let V, W be F-vector spaces. Prove that there is an isomorphism

$$\bigoplus_{i+j=r} E^i(V) \otimes E^j(W) \to E^r(V \oplus W).$$
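When V and W are finite-dimensional, a quick consistency check is that both sides of the claimed isomorphism have the same dimension. The sketch below assumes the standard dimension formula $\dim E^i(V) = \binom{m}{i}$ for $\dim V = m$; the function names are hypothetical.

```python
from math import comb


def lhs_dimension(m: int, n: int, r: int) -> int:
    """Dimension of the direct sum over i + j = r of E^i(V) (x) E^j(W)."""
    return sum(comb(m, i) * comb(n, r - i) for i in range(r + 1))


def rhs_dimension(m: int, n: int, r: int) -> int:
    """Dimension of E^r(V (+) W), where dim(V (+) W) = m + n."""
    return comb(m + n, r)


if __name__ == "__main__":
    for m in range(0, 6):
        for n in range(0, 6):
            for r in range(0, m + n + 2):
                assert lhs_dimension(m, n, r) == rhs_dimension(m, n, r)
```

Equality of dimensions is the Vandermonde identity; it supports the statement but does not by itself produce the isomorphism.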


⁹ In geometry, the terms “Grassmann space” and “Projective space” not only include the
“points”, $G_d(V)$ and $G_1(V)$, respectively, but also include collections of “lines” as
well. In the case of the projective space, the lines are $G_2(V)$, while, for the
Grassmann space $G_d(V)$, $1 < d < n$, the lines are all pairs
$(X, Y) \in G_{d-1}(V) \times G_{d+1}(V)$ such that $X \subset Y$. The “points” belonging to
$(X, Y)$ are those $Z \in G_d(V)$ such that $X \subset Z \subset Y$. These lines correspond
to 2-dimensional subspaces of $E^d(V)$.
13. Let V be an F-vector space, where $\mathrm{char}\, F \ne 2$. Define the linear
transformation $S : V \otimes V \to V \otimes V$ by setting $S(v \otimes w) = w \otimes v$.

(a) Prove that S has minimal polynomial $m_S(x) = (x - 1)(x + 1)$.

(b) If $V_1 = \ker(S - I)$, $V_{-1} = \ker(S + I)$, conclude that
$V \otimes V = V_1 \oplus V_{-1}$.

(c) Prove that $V_1 \cong S^2(V)$, $V_{-1} \cong E^2(V)$.

(d) If $T : V \to V$ is any linear transformation, prove that $V_1$ and $V_{-1}$ are
$T \otimes T$-invariant subspaces of $V \otimes V$.
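For a finite-dimensional V the swap map can be written down explicitly. The sketch below assumes $\dim V = n$ with basis $e_0, \ldots, e_{n-1}$, orders the basis $e_i \otimes e_j$ of $V \otimes V$ by the index $i \cdot n + j$, and works with integer matrices. The eigenspace dimensions are read off from the trace, which is legitimate only once parts (a) and (b) are known (so that S is diagonalizable with eigenvalues $\pm 1$ in characteristic different from 2). All function names are hypothetical.

```python
def swap_matrix(n):
    """Matrix of S(e_i (x) e_j) = e_j (x) e_i in the basis indexed by i*n + j."""
    size = n * n
    S = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            S[j * n + i][i * n + j] = 1   # column e_i (x) e_j goes to row e_j (x) e_i
    return S


def matmul(A, B):
    """Product of two square matrices of the same size."""
    size = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def eigenspace_dimensions(n):
    """(dim V_1, dim V_{-1}), from dim V_1 + dim V_{-1} = n^2 and
    trace(S) = dim V_1 - dim V_{-1}."""
    S = swap_matrix(n)
    trace = sum(S[k][k] for k in range(n * n))
    return ((n * n + trace) // 2, (n * n - trace) // 2)


if __name__ == "__main__":
    n = 3
    S = swap_matrix(n)
    identity = [[int(i == j) for j in range(n * n)] for i in range(n * n)]
    assert matmul(S, S) == identity       # S satisfies x^2 - 1
    print(eigenspace_dimensions(n))
```

The two numbers returned can be compared with the dimensions of $S^2(V)$ and $E^2(V)$ as a check on part (c).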


14. Let V be an n-dimensional F-vector space.

(a) Prove that E(V) is graded-commutative in the sense of Exercise (1) in
Sect. 13.13.6.

(b) If $\{L_i\}$, $i = 1, \ldots, n$, is a collection of one-dimensional subspaces of V
which span V, prove that as graded-commutative algebras,

$$E(V) \cong E(L_1) \otimes E(L_2) \otimes \cdots \otimes E(L_n).$$


15. Let $G = \mathrm{GL}(V)$ act naturally on the n-dimensional F-vector space V so that V
becomes a right FG-module.

(a) Show that the recipe $g(f) = \det g \cdot f \circ g^{-1}$, $g \in G$,
$f \in V^* = \mathrm{Hom}(V, F)$, defines a representation of G on $V^*$, the dual space
of V.

(b) Show that in the above action, G acts transitively on the non-zero vectors of
$V^*$.

(c) Fix any isomorphism $E^n(V) \cong F$; show that the map $E^{n-1}(V) \to V^*$ given
by $\omega \mapsto \omega \wedge \cdot$ is a morphism of right FG-modules, i.e., the map
commutes with the action of G.

(d) A vector $z \in E^d(V)$ is said to be decomposable or pure if and only if it
has the form $z = v_1 \wedge \cdots \wedge v_d$ for suitable vectors $v_i \in V$. Since G clearly
acts on the set of decomposable vectors in $E^{n-1}(V)$, conclude from (b) that
every vector in $E^{n-1}(V)$ is decomposable.


16. (Veronesean action) Again fix a vector space V with finite basis
$X = \{x_1, \ldots, x_n\}$ with respect to the field F.

(a) Let $f : V \to F$ be a functional (that is, a vector of the dual space $V^*$).
Show that $S^d(f)$ is a linear mapping $S^d(V) \to F$, and so is a functional of
$S^d(V)$, $1 \le d \le n$.

(b) Set $G = \mathrm{GL}(V)$. First let us view V as a left G-module. Thus for all
$S, T \in G$ and $v \in V$, we have $(ST)v = S(T(v))$. If $f \in \mathrm{Hom}(V, F)$ and
$T \in G$, then $f \circ T$ is also a functional. Show that for all $(f, T) \in V^* \times G$,
the mapping

$$f \mapsto f \circ T$$

defines a right action of G on $V^*$ (i.e. $V^*$ becomes a right FG-module).
[Hint: One must show that $f \circ (ST) = f \circ S \circ T$. One can check these maps
at arbitrary $v \in V$ recalling that the operators T, S and f operate on V from
the left.] (By way of comparison with part (a) of the previous exercise, note
that we have defined a representation of G on the dual space $V^*$, without
employing the determinant.)

(c) For each $f \in \mathrm{Hom}(V, F)$, and integer $d \ge 1$, $S^d(f) : S^d(V) \to F$ is a
functional of $S^d(V)$ called a Veronesean vector. With respect to the basis
$X = \{x_i\}$, we may regard $S^d(V)$ as the space of homogeneous polynomials
of degree d in $F[x_1, \ldots, x_n]$. If $f(x_i) = \epsilon_i \in F$, show that at each
homogeneous polynomial $p \in S^d(V)$,

$$S^d(f)(p(x_1, \ldots, x_n)) = p(\epsilon_1, \ldots, \epsilon_n) \in F.$$

Conclude from this that if $(a, f) \in F \times V^*$, then

$$S^d(af) = a^d S^d(f). \qquad (13.22)$$

(d) Describe how this produces a bijection between the 1-dimensional subspaces
of $V^*$ (the points of the projective space $\mathbb{P}(V^*)$) and the 1-spaces of
$S^d(V)^*$ spanned by the individual Veronesean vectors. (The latter collection
of 1-dimensional subspaces is called the (projective) Veronesean variety of
degree d.) [Hint: Use Eq. (13.22).]

(e) Prove that the (right) action of $G = \mathrm{GL}(V)$ on $V^*$ described in part (b) of
this exercise transfers to an action of G on $S^d(V)^*$ which is transitive on
the non-zero Veronesean vectors it contains. It also induces a permutation
isomorphism between the action of G on the projective points of $\mathbb{P}(V^*)$
and the action of G on the projective Veronesean variety of degree d. [Hint:
Citing relevant theorems, show that if $(f, T) \in V^* \times G$ then
$S^d(f \circ T) = S^d(f) \circ S^d(T)$, so that

$$\rho^d(T) : S^d(f) \mapsto S^d(f) \circ S^d(T)$$

defines a mapping

$$\rho^d : G \to \mathrm{End}_F(S^d(V)^*)$$

which describes this action.]
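Equation (13.22) can be illustrated numerically once one accepts the identification in this exercise, namely that $S^d(f)$ evaluates a homogeneous polynomial p at the point $(\epsilon_1, \ldots, \epsilon_n) = (f(x_1), \ldots, f(x_n))$. Under that assumption, the sketch below stores a polynomial as a dictionary from exponent tuples to coefficients and checks that replacing f by $af$ multiplies the value by $a^d$. The polynomial and the function name `evaluate` are hypothetical examples.

```python
def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at a point."""
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exponents):
            term *= x ** e
        total += term
    return total


if __name__ == "__main__":
    d = 3
    # p = x1^2 x2 + 3 x2^3 - 2 x1 x2 x3, homogeneous of degree 3.
    p = {(2, 1, 0): 1, (0, 3, 0): 3, (1, 1, 1): -2}
    assert all(sum(exps) == d for exps in p)

    eps = (2, -1, 5)                  # the values f(x_1), f(x_2), f(x_3)
    a = 3
    scaled = tuple(a * e for e in eps)  # the values (a f)(x_i)

    # S^d(a f)(p) = a^d * S^d(f)(p), checked for this one polynomial.
    assert evaluate(p, scaled) == a ** d * evaluate(p, eps)
```

The same check fails for a polynomial that is not homogeneous, which is one way to see why the grading matters in part (c).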


17. Attempt to emulate the development of the previous exercise with E(V)
replacing S(V) everywhere. Thus for $1 < d < n$, one desires an injective mapping

$$\epsilon : \mathbb{P}(V^*) \to \mathbb{P}(E^d(V^*))$$

by transferring a functional f of $V^*$ to $E^d(f) : E^d(V) \to F$. Explain in detail
what goes wrong.
13.13.8 Exercise for Sect. 13.11


1. Write out a proof of part 3 of Theorem 13.11.1.
A
Abelian group, 88 

divisible, 268 

torsion subgroup, 362 
Absolute value, 424 
ACC, see ascending chain condition 
Adjoint 

left, 488 

right, 488 
Algebra, 493 

Clifford, 523 

exterior, 504 

graded commutative, 522 

monoid ring, 199 

of ascending chains, 508 

symmetric, 504 

tensor, 503 
Algebraic independence, 412 
Algebraic integer, 255, 316 
Algebraic interval, 239 
Alternating group, 111 
Annihilator 

of a module, 271 

right, 196 
Antiautomorphism 

of a ring, 190 
Antichain, 32 
Antiderivation, 523 
Ascending chain condition, 42 
Association class, 281 
Associative law, 74 
Atom 

in a poset, 47, 69 
Automorphism 

of a group G, 81 
of a graph, 76
of a ring, 190 
Automorphism group, of a group, 81 


B 
Baer’s criterion, 277 
Balanced map 
multiple arguments, 491 
Balanced mapping, 479 
Basis 
of a free module, 335 
of a module, 243 
Bell numbers, 36, 136 
Bilinear form 
symmetric, 315 
Bimodule, 234 
symmetric, 234 


C
Canonical form, rational, 348 
Cardinal number, 16, 18 
Cartesian product 
of n sets, 5 
of two sets, 5 
Category, 473 
initial object, 476 
isomorphism, 476, 478 
morphism, 473 
object, 473 
opposite, 477 
subcategory, 474 
terminal object, 476 
Cauchy sequence, 426 
convergence of, 426 
Cayley graph, 164 
Center

of a ring, 188 
Centralizer 

in rings, 188 
Centralizer in a group, 101 
Centrum, see center 
Characteristic, 220 

positive, 220 
Characteristic polynomial, 345, 350 
Characteristic subgroup, 90 
Characteristic zero, 220 
Chinese Remainder theorem, 224 
Clifford algebra, 523 
Closure operator, 38 
Cofree R-module, 270 
Cokernel, 515 
Comaximal ideals, 224 
Commutator 

in a group, 140 

in groups, 140 

triple, 140 
Commutator subgroup, 142 
Companion matrix, 347 
Complement 

of a subgroup, 153 
Completion, 225, 426 
Complex numbers, 211 
Composition factors, 139 
Composition series, 58, 139 

finite, 248 
Compositum, 415 
Concatenation, 199, 475 
Conjugacy class, 107 
Conjugate 

of a group element, 107 

of a group subset, 107 

of a subgroup, 107 
Conjugation 

by a group element, 87 
Convolution, 200 
Countably infinite, 18 
Cover, 239 
Cycle 

in a graph, 64 
Cyclic group, 75 


D 
Dedekind 

Independence Lemma, 381 
Dedekind domain, 305 
Degree 

of a polynomial, 206 


Index 


Dependence, 60 

algebraic, 411 
Dependence relation, 59, 60 

dimension, 63 

flat, 61 

independent set, 61 

spanning set, 61 
Derivative, formal, 374 
Derived length, 142 
Derived series, 142 
Descending chain condition, 44 
Dihedral group, 75 
Dimension 

in a dependence relation, 63 

of a vector space, 246 
Direct product 

of groups, 95 
Direct sum 

of groups, 95 
Dirichlet algebra, 216, 494 
Discrete valuation ring, 321 
Discriminant, 393, 394 
Divides, 3 
Divisibility, 280 

associates, 280 
Divisibility poset, 281 
Divisible abelian group, 268 
Division ring, 187 
Domain 

Bezout, 351 

principal ideal, 351 
Dual basis, 315 


E 
Eigenroot 

of a linear transformation, 350 
Eigenvector, 350 
Eisenstein integers, 287 
Eisenstein irreducibility criterion, 318 
Eisenstein numbers, 189, 218, 327 
Elementary column operations, 336 
Elementary divisors, 339 
Elementary row operations, 336 
Embedding of groups, 80 
Endomorphism 

of an additive group, 232 

of groups, 81 
Endomorphism ring, 213, 444 
Epimorphism 

of modules, 237 
Equation 

root of, 357 


Equivalence relation, 6 
Euclidean domain, 285 
Euler phi-function, 189 
Euler totient function, 391 
Evaluation homomorphism, 203 
Exterior algebra, 504 

pure vectors of, 525 


F 
Factor module, 236 
Factor ring, 192 
Factor system, 153 
Factorization, 283 
proper, 283 
Factorization length, 336 
Farey series, 35 
Field 
characteristic of, 356 
composite of, 415 
definition, 356 
generated subfield, 356 
of fractions, 300 
of Laurent series, 436 
prime subfield of, 356 
quadratic, 326 
splitting, 365 
subfield, 356 
Field extension, 357 
algebraic element of, 357 
algebraic independence in, 412 
degree of, 357 
Galois, 384 
normal, 368 
purely inseparable, 377 
radical, 400 
separable, 376 
separable closure, 379 
transcendental, 357, 407 
Filter, 31 
principal, 31 
Flat, 486 
generated by a set, 61 
in a matroid, 61 
Forest 
in a graph, 64 
Formal derivative, 374 
Formation of groups, 145 
Fours group, 75 
Fractional ideal, 309 
principal fractional ideal, 309 
Frame, 167 
Frattini element, 48 
Frattini subgroup, 145
Free group, 169 
Free R-module, 274 
Free module, 244 

rank, 246, 335 
Free monoid, 166 
Function 

rational, 407 
Functor 

covariant, 478 

isomorphism, 478 


G 
Galois connection, 38, 210 
Galois E., 401 
Galois group 
of a polynomial, 392 
Gaussian integers, 189, 218, 287 
Generators and relations, 172 
Graded algebra 
homomorphism of, 499 
Graded commutative, 522 
Graded-commutative algebra, 522 
Grading of an algebra, 497 
Graph 
automorphism of, 76 
Cayley, 164 
connected, 474 
edges, 474 
homomorphism, 165 
simple, 164, 474 
vertices, 474 
walk, 474 
Grassmann space, 524 
Greatest common divisor, 4, 282, 334 
Group 
abelian, 88 
alternating, 111 
automorphism of, 81 
commutator subgroup, 140 
cyclic, 75 
definition of, 74 
dihedral, 75 
embedding of, 80 
endomorphism of, 81 
epimorphism, 80 
fours group, 75, 95 
generalized dihedral, 98 
generators of, 164 
homomorphism, 80 
kernel of, 81 
inner automorphism of, 87 
isomorphism of, 81
locally cyclic, 362 
nilpotent, 146 
order of, 77 
p-nilpotent, 161 
presentation of, 172 
simple, 137 
solvable, 142 
symmetric, 75, 105 
Group action, 106 
equivalence of, 114 
faithful, 106 
k-fold transitivity, 127 
k-homogeneous, 127 
multiply transitive, 127 
primitive, 124 
rank of, 124 
regular, 115 
subdegrees, 125 
transitive, 106 
Group algebra, 495 
Group extension, 98 
split, 98 
Group of units, 517 
Group representation, 102 
Group ring, 202 
Groups 
direct product of, 95 
direct sums of, 95 


H 
Hereditary, 276 
Hom 

as a functor, 262 


definition for modules, 261 


Homogeneous 
element, 497 
summand, 497 


Homogeneous component, 447, 449 


Homogeneous ideal, 499 
Homomorphism 

of graphs, 165 

of rings, 189 

of R-modules, 237 


I 

Ideal 
two-sided, 191 
homogeneous, 499 
left, 191 
nilpotent, 449 


P-primary, 225 
primary, 225 
prime, 196, 450 
primitive, 462 
principal, 211, 285 
products of, 306 
radical, 224 
right, 191 
Ideal class group, 309 
Ideals 
comaximal, 224 
product of, 449 
Idempotent, 452 
Identity morphism, 474 
Image 


of module homomorphism, 238 


Image poset, 33 
Imprimitivity 

system of, 123 
Independent set 

in dependence theory, 61 
Indeterminate, 202 
Induced matroid, 64 
Induced module, 520 
Initial object, 476 
Injective module, 267 
Inner automorphism 

of a group, 87 
Integral domain, 187, 218 

characteristic of, 220 

Euclidean, 285 

field of fractions, 300 

PID, 289 


principal ideal, 283, 285, 334 


quadratic, 325 
UFD, 289


unique factorization, 289 


Integral elements 

ring of, 314 
Integral group ring, 517 
Integral over a domain, 255 
Integrally closed, 302 
Interval 

algebraic, 41 

height of, 42 
Invariant factors, 339 


of a linear transformation, 345 


Invariant subgroups, 90 
Inverse image, 101 
Invertible ideal, 310 
Involution, 77 
Irreducible 


element in a domain, 283 


Irreducible action 

of a linear transformation, 350 
Irreducible element, 289 
Isomorphism, 476 

of groups, 81 

of modules, 237 


J 
Join 

in a poset, 45 
Jordan block, 349 


Jordan canonical form, of a matrix, 349 


Jordan decomposition, 343 


K 
Kernel, 477 

of a group homomorphism, 81 

of a module homomorphism, 238 
Kronecker product, 519 
Krull-Remak-Schmidt theorem, 252 


L 
Lattice, 46 
complete, 46 
modular, 54, 247 
Laurent series, 436 
Least common divisor, 3 
Least common multiple, 282, 334 
Left adjoint, 488 
Leibniz rule, 374 
Limit 
of a convergent sequence, 426 
Linear combination, 243 
Linear dependence 
right, 245 
Linear independence, 243 
Linear transformation 
characteristic polynomial of, 345 
invariant factors of, 345 
minimal polynomial of, 345 
nullity of, 350 
rank of, 350 
Local ring, 300, 434 
residue field of, 434 
Localization 
at a prime ideal, 300, 312 
by S, 297 
Lower central series, 146 
Lower semilattice, 46 
semimodular, 50 


M 
Mappings 
bijection, 9 
codomain of, 8 
containment mapping, 8 
domain of, 8 
equality of, 8 
extension of, 8 
identity mapping, 8 
injective, 9 
inverse of a bijection, 9 
one-to-one, 9 
onto, 9 
restriction of, 8 
surjective, 9 
Matrix 
nilpotent, 351 
R-invertible, 228 
Matroid, 63 
induced, 64 
Maximum condition, 43 
Measure 
of a poset interval, 51 
Meet 
in a poset, 45 
Minimal polynomial, 345 
Minimum condition, 44 
Möbius function, 216
Möbius inversion, 216
Modular lattice, 54 
Modular law, 54 
Module 
Artinian, 247 
basis, 243 
bimodule, 234 
cofree, 270 
completely reducible, 445 
cyclic, 350 
direct product, 241 
direct sum, 241 
internal, 242 
dual of, 262 
factor module of, 236 
faithful, 232 
finitely generated submodule, 236 
flat, 486 
free, 244, 274 
homomorphism, 237 
epimorphism, 237 
isomorphism, 237 
kernel of, 238 
monomorphism, 237 
indecomposable, 250 
injective, 267

internal direct sum, 242 

irreducible, 239, 350 

left, 231 

Noetherian, 247 

p-primary part of, 340 

prime, 462 

projective, 265 

right, 232 

submodule, 235 

submodule generated by, 235 

sum of submodules in, 236 

torsion, 340 

torsion submodule, 339 
Module homomorphism 

cokernel, 515 
Monoid, 13, 198 

commutative, 51 

free, 166 

free commutative, 199 

monoid ring, 199 
Monomorphism, 517 

of modules, 237 
Morphism, 473 
Multilinear mapping, 512 

alternating, 512 

symmetric, 512 
Multiplicity 

of zeros, 361 
Multiset, 36 

as a monoid, 51 

finite, 37 

poset of, 37
Multitensor 

pure, 492 


N 
Natural numbers, 18 
Nilpotent 

matrix, 351 
Nilpotent element, 225 
Non-generator, 151 
Norm 

of a complex number, 215, 291 
Norm map, 416 
Norm mapping 

in quadratic field, 327 
Normal set, 102 
Normal subgroup, 90 
Normalizer 

of a group subset, 116 


Normalizer in a group, 102 
Nullity 

of a linear transformation, 350 
Number 

cardinal, 18 

natural, 18 


O 
Object, 473 
Opposite rings, 213 
Orbit, 106 
length, 106 
Orbital, 125 
diagonal, 125 
symmetric, 125 
Order 
of a group element, 77 
Order ideal, 30 
principal, 31 
Order of a group, 77 
Overring, 312 


P 
Partial transversal, 64 
Partition, 5, 67 
component of, 36, 67 
finitary, 67 
refinement of, 36 
Partition function, 344 
Partition of a number, 344 
Permutation, 105 
cycle, 108 
even and odd, 111 
finitary, 107 
transposition, 110 
Phi-function, 189 
PID, 283, 285 
Plücker embedding, 524
p-nilpotent, 161 
Polynomial 
cyclotomic, 391 
degree, 203 
leading coefficient of, 319 
multiplicity of zero, 361 
separable over a field, 374 
symmetric, 396 
zero of, 209, 357 
Polynomial algebra, 495 
Polynomial ring, 202, 226 
over a field, 289 
Polynomials 


elementary symmetric, 396 

power sum, 396 
Poset, 22 

antichain, 32 

chain, 26, 41 

direct sum, 33 

dual, 24 

filter, 31 

image, 33 

interval, 24 

isomorphism classes, 33 

lattice, 46 

order ideal, 30 

product, 32 

totally ordered, 25 

well-ordered, 26, 34 
Poset mappings 

closure operator, 38 

embedding, 33 

Galois connection, 38 

isomorphism, 33 

monotone non-decreasing, 37 

order-reversing, 38 
Power set, 7 
P-primary ideal, 225 
Primary ideal, 225 
Prime element, 284, 289 
Prime ideal, 300 
Prime ring, 450 
Primitive 

group action, 124 
Primitive ideal, 462 
Primitive polynomial, 293 
Primitive roots of unity, 391 
Principal fractional ideal, 309 
Principal ideal, 211 
Principal ideal domain, 218, 283, 285 
Product 

point-wise, 212 
Projection homomorphism, 238 
Projection morphism 

in a direct sum, 95 
Projective module, 265 
Projective space, 524 
p-Sylow subgroups, 118 
Pure tensors, 481 


Q 

Quadratic domain, 325 
Quadratic field, 326 
Quadratic form, 523 
Quasi-inverse, 466 


Quasiregular element, 466 
Quasiregular ideal, 466 
Quaternions, 214 


R 

Radical, 48 
of a module, 444 
prime, 450 

Rank, 246 
of a free module, 335 
Rational canonical form, of a matrix, 348


Reduced word 
in a free group, 166, 172 
Refines 
a chain, 41 
a partition, 36 
Relatively prime, 4, 334 
Relatively prime ideals, 224 
Residual quotient 
left, 196
right, 196 
Right adjoint, 488 
Ring 
commutative, 187 
completely reducible, 452 
definition, 186 
division, 187 
factor ring of, 192 
field, 187 
group ring, 202 
hereditary, 276 
homomorphism of, 189 
ideal of, 191 
integral domain, 187 
left Noetherian, 253 
local, 300, 434 
residue field of, 434 
monoid ring, 199 
of integral elements 
classical, 314 
of matrices, 214 
opposite of, 213 
polynomial ring, 202 
prime, 450 
primitive, 452 
right Artinian, 273 
right ideal of, 191 
right Noetherian, 253, 273 
semiprime, 450 
semiprimitive, 452 
simple, 195 
subring of, 187 


Ring homomorphism 
kernel of, 192 
Ring left ideal of, 191 


S
Scalar multiplication, 231 
Schröder-Bernstein theorem, 62
Semidirect product, 96 
Semigroup, 13 
Semilattice 
Frattini element, 48 
lower, 46 
radical, 48 
socle, 49 
upper, 46 
Semimodular, 50 
Semiprime ring, 450 
Separable 
closure of a set, 379 
field element, 376 
field extension, 376 
polynomial, 374 
Sequence 
bounded, 428 
convergent, 426 
exact, 263 
lower bound of, 429 
null, 426 
short exact, 263 
upper bound of, 428
Sets 
cardinality of, 16 
intersection of, 5 
power sets of, 7 
union of, 5 
Short exact sequence, 263 
split, 264 
Simple group, 99, 102 
Socle, 49 
Solvable group, 142 
Spanning set, 61 
Spans in modules, 243 
Split extension 
of groups, 153 
Split extension of groups, 98 
Subgroup 
centralizer, 101 
commutator, 142 
definition of, 77 
derived, 142 
Frattini, 152 
maximal, 102 


maximal normal, 99, 102 
normal, 90 
normalizer, 102 
p-Sylow, 118 
second derived, 142 
subnormal, 137 
Subgroup criterion, 78 
Submodule 
finitely generated, 236 
Subposet 
induced, 23 
Subring, 187 
Sum 
point-wise, 212 
Sylow subgroup, 118 
Symmetric algebra, 504 
Symmetric groups, 75 


T 
Tensor algebra, 503 
Tensor product, 480 
Terminal object, 476 
Three subgroups lemma, 141 
Torsion element, 339 
Torsion submodule, 339 
Tower of fields, 357 
Trace mapping 

in a quadratic field, 327 
Tracelike linear forms, 315 
Transcendence basis, 412 
Transcendence degree, 412 
Transcendental element, 407 
Transitive, 106 
Transpose, 191 

o-transpose, 191 
Transposition, 110 
Transversal 

in groups, 152 
Trees 

in a graph, 64 


U 

UFD, 289 

Unit of a ring, 188 

Universal mapping property, 476 
Upper central series, 147 

Upper semilattice, 46 


V
Valuation, 424 
archimedean, 424 
non-archimedean, 425 
trivial, 425 
Valuation ring, 320, 433 
Vandermonde matrices, 362 
Vector, 245 
Vector space 
dimension, 246 
right, left, 245 
Veronesean 
as projective variety, 526 
vector, 526 


W
Walk, 474 
Well-ordered property, 26, 34 


Z 
Zeroes 

of a polynomial, 209 
Zeta function, 216 
Zorn’s Lemma, 26 